Message ID | 20240611230649.1473-1-egyszeregy@freemail.hu |
---|---|
State | New |
Series | [v3] fetch2/wget: Add user_agent parameter so it can be used optionally |
It helps to include all relevant links in the commit message, e.g. the issues with xilinx's PMU_ROM that you had:
https://support.xilinx.com/s/question/0D54U00008RolRMSAZ/yocto-metaxilinxcore-cannot-find-pmuromnative-url?language=en_US

I just tried to reproduce this locally, and there is no issue with wget:

alex@alex-lx-laptop:~$ wget https://www.xilinx.com/bin/public/openDownload?filename=PMU_ROM.tar.gz
--2024-06-12 09:29:55-- https://www.xilinx.com/bin/public/openDownload?filename=PMU_ROM.tar.gz
Resolving www.xilinx.com (www.xilinx.com)... 184.25.239.57, 184.25.239.48
Connecting to www.xilinx.com (www.xilinx.com)|184.25.239.57|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://xilinx-ax-dl.entitlenow.com/dl/ul/2021/12/20/R210497260/PMU_ROM.tar.gz?hash=oU3pGSFXZ2pq00I2gGZX-g&expires=1718209784&filename=PMU_ROM.tar.gz [following]
--2024-06-12 09:29:56-- https://xilinx-ax-dl.entitlenow.com/dl/ul/2021/12/20/R210497260/PMU_ROM.tar.gz?hash=oU3pGSFXZ2pq00I2gGZX-g&expires=1718209784&filename=PMU_ROM.tar.gz
Resolving xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)... 184.25.239.24, 184.25.239.83
Connecting to xilinx-ax-dl.entitlenow.com (xilinx-ax-dl.entitlenow.com)|184.25.239.24|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8031 (7.8K) [application/x-gzip]
Saving to: ‘openDownload?filename=PMU_ROM.tar.gz’

openDownload?filename=PMU_ROM.tar.gz 100%[=====================================================================================================================>] 7.84K --.-KB/s in 0s

2024-06-12 09:29:57 (87.0 MB/s) - ‘openDownload?filename=PMU_ROM.tar.gz’ saved [8031/8031]

The sha256sum matches what is in the recipe:

alex@alex-lx-laptop:~$ sha256sum openDownload\?filename\=PMU_ROM.tar.gz
f9a450ef960979463ea0a87a35fafb4a5b62d3a741de30cbcef04c8edc22a7cf openDownload?filename=PMU_ROM.tar.gz

The report indicated that there was a broken redirect via amd.com (which does require a browser user-agent), but this seems to be no longer happening. Can you check?

All of this is to say, I'm still reluctant to support making the code more complex when there is no proven need for it.

Alex

On Wed, 12 Jun 2024 at 01:07, Livius via lists.yoctoproject.org <egyszeregy=freemail.hu@lists.yoctoproject.org> wrote:
>
> From: Benjamin Szőke <egyszeregy@freemail.hu>
>
> Add "user_agent" optional parameter for wget fetcher to able
> to use it if HTTP servers block requests with the default wget
> user-agent.
>
> Signed-off-by: Benjamin Szőke <egyszeregy@freemail.hu>
> ---
>  conf/bitbake.conf                             |  1 +
>  .../bitbake-user-manual-fetching.rst          | 23 +++++++++++++------
>  .../bitbake-user-manual-ref-variables.rst     |  5 ++++
>  lib/bb/fetch2/wget.py                         | 11 +++++----
>  4 files changed, 28 insertions(+), 12 deletions(-)
>
> diff --git a/conf/bitbake.conf b/conf/bitbake.conf
> index f5a5a333a..a16f72d25 100644
> --- a/conf/bitbake.conf
> +++ b/conf/bitbake.conf
> @@ -44,3 +44,4 @@ TARGET_ARCH = "${BUILD_ARCH}"
>  TMPDIR = "${TOPDIR}/tmp"
>  WORKDIR = "${TMPDIR}/work/${PF}"
>  GITPKGV = "${@bb.fetch2.get_srcrev(d, 'gitpkgv_revision')}"
> +BB_FETCH_USER_AGENT ??= "bitbake/${BB_VERSION}"
> diff --git a/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst b/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
> index fb4f0a23d..0ee07992f 100644
> --- a/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
> +++ b/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
> @@ -221,13 +221,21 @@ HTTP/FTP wget fetcher (``http://``, ``ftp://``, ``https://``)
>  This fetcher obtains files from web and FTP servers. Internally, the
>  fetcher uses the wget utility.
>
> -The executable and parameters used are specified by the
> -``FETCHCMD_wget`` variable, which defaults to sensible values. The
> -fetcher supports a parameter "downloadfilename" that allows the name of
> -the downloaded file to be specified. Specifying the name of the
> -downloaded file is useful for avoiding collisions in
> -:term:`DL_DIR` when dealing with multiple files that
> -have the same name.
> +The executable and parameters used are specified by the ``FETCHCMD_wget``
> +variable, which defaults to sensible values. The fetcher supports
> +parameters:
> +
> +- *downloadfilename:* That allows the name of the downloaded file
> +  to be specified.
> +
> +- *user_agent:* Enable to use a default ``bitbake/${BB_VERSION}`` user-agent
> +  which is defined in :term:`BB_FETCH_USER_AGENT`.
> +
> +Specifying the name of the downloaded file is useful for avoiding
> +collisions in :term:`DL_DIR` when dealing with multiple files
> +that have the same name. A few HTTP servers block requests with
> +the default wget user-agent, in this case specifying a valid
> +user-agent can solve this issue.
>
>  If a username and password are specified in the ``SRC_URI``, a Basic
>  Authorization header will be added to each request, including across redirects.
> @@ -239,6 +247,7 @@ Some example URLs are as follows::
>     SRC_URI = "http://oe.handhelds.org/not_there.aac"
>     SRC_URI = "ftp://oe.handhelds.org/not_there_as_well.aac"
>     SRC_URI = "ftp://you@oe.handhelds.org/home/you/secret.plan"
> +   SRC_URI = "https://oe.handhelds.org/not_there.aac;user_agent=1"
>
>  .. note::
>
> diff --git a/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst b/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
> index 899e584f9..3f310bd72 100644
> --- a/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
> +++ b/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
> @@ -699,6 +699,11 @@ overview of their function and contents.
>        Within an executing task, this variable holds the hash of the task as
>        returned by the currently enabled signature generator.
>
> +   :term:`BB_FETCH_USER_AGENT`
> +      Specifies a user-agent string which BitBake uses if "user_agent"
> +      parameter is enabled for HTTP/FTP wget fetcher. Default value can
> +      be found in ``conf/bitbake.conf``.
> +
>     :term:`BB_VERBOSE_LOGS`
>        Controls how verbose BitBake is during builds. If set, shell scripts
>        echo commands and shell script output appears on standard out
> diff --git a/lib/bb/fetch2/wget.py b/lib/bb/fetch2/wget.py
> index 2e9211763..30a56df71 100644
> --- a/lib/bb/fetch2/wget.py
> +++ b/lib/bb/fetch2/wget.py
> @@ -52,11 +52,8 @@ class WgetProgressHandler(bb.progress.LineFilterProgressHandler):
>
>  class Wget(FetchMethod):
>      """Class to fetch urls via 'wget'"""
> -
> -    # CDNs like CloudFlare may do a 'browser integrity test' which can fail
> -    # with the standard wget/urllib User-Agent, so pretend to be a modern
> -    # browser.
> -    user_agent = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0"
> +    def init(self, d):
> +        self.user_agent = d.getVar("BB_FETCH_USER_AGENT")
>
>      def check_certs(self, d):
>          """
> @@ -89,6 +86,10 @@ class Wget(FetchMethod):
>
>          self.basecmd = d.getVar("FETCHCMD_wget") or "/usr/bin/env wget -t 2 -T 30"
>
> +        is_user_agent_enabled = ud.parm.get("user_agent","0") == "1"
> +        if is_user_agent_enabled:
> +            self.basecmd += f" --user-agent='{self.user_agent}'"
> +
>          if ud.type == 'ftp' or ud.type == 'ftps':
>              self.basecmd += " --passive-ftp"
>
> --
> 2.45.2.windows.1
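
To try the behaviour proposed in the patch outside BitBake, here is a minimal Python sketch. It is not the fetcher code itself. It assumes a simplified `;key=value` split of the `SRC_URI` entry, and the helper names `split_url_params` and `build_wget_cmd` are hypothetical. BitBake's real URL decoding and the rest of the wget command line are more involved. The sketch appends `--user-agent` only when `user_agent=1` is present. Where the patch wraps the value in single quotes by hand, the sketch uses `shlex.quote`, on the assumption that a user-agent string may itself contain quote characters.

```python
import shlex

# Default taken from the patch context; FETCHCMD_wget may override it in BitBake.
DEFAULT_BASECMD = "/usr/bin/env wget -t 2 -T 30"


def split_url_params(src_uri):
    """Split 'url;key=value;...' into (url, params).

    Hypothetical, simplified helper for illustration only.
    """
    url, *raw_params = src_uri.split(";")
    params = {}
    for item in raw_params:
        if not item:
            continue
        key, _, value = item.partition("=")
        params[key] = value
    return url, params


def build_wget_cmd(src_uri, user_agent, basecmd=DEFAULT_BASECMD):
    """Return a wget command line, adding --user-agent only on opt-in."""
    url, params = split_url_params(src_uri)
    cmd = basecmd
    # Same opt-in test as the patch: the parameter must be exactly "1".
    if params.get("user_agent", "0") == "1":
        cmd += " --user-agent=" + shlex.quote(user_agent)
    return cmd + " " + shlex.quote(url)


if __name__ == "__main__":
    print(build_wget_cmd(
        "https://oe.handhelds.org/not_there.aac;user_agent=1",
        "bitbake/2.8.0",
    ))
```

Without `;user_agent=1` in the URL, the command is returned with no `--user-agent` option, matching the opt-in design of the patch.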
The PMU_ROM download issue was solved by Xilinx.
https://lists.yoctoproject.org/g/meta-xilinx/topic/106394220#msg5364

There was a commit (later reverted) where this issue was also reported for the openjdk download. It still does not work via wget.
https://github.com/openembedded/bitbake/commit/d6fa261a9603677f0b3abbd309c1ca6073b63f4c

In the openjdk-8 272 recipe, all of the links are broken if the wget download does not use a custom user-agent.
https://layers.openembedded.org/layerindex/recipe/44301/

does not work -> wget https://hg.openjdk.java.net/jdk8u/jdk8u/langtools/archive/jdk8u272-ga.tar.bz2
works -> wget --user-agent='bitbake/2.8.0' https://hg.openjdk.java.net/jdk8u/jdk8u/langtools/archive/jdk8u272-ga.tar.bz2

So it seems this patch is necessary and is able to solve this kind of issue in the future.
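
The comparison above can be approximated from Python without wget. The following is a rough sketch only: the function names `make_request` and `probe` are hypothetical, and when no user-agent is passed, `urllib` sends its own default `Python-urllib/x.y` string rather than wget's, so a server may respond differently than it does to plain wget.

```python
import urllib.error
import urllib.request


def make_request(url, user_agent=None):
    """Build a HEAD request, optionally with a custom User-Agent header."""
    headers = {}
    if user_agent:
        headers["User-Agent"] = user_agent
    return urllib.request.Request(url, headers=headers, method="HEAD")


def probe(url, user_agent=None, timeout=30):
    """Return the HTTP status code for url (performs a network request)."""
    try:
        with urllib.request.urlopen(make_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their code instead.
        return err.code


if __name__ == "__main__":
    url = "https://hg.openjdk.java.net/jdk8u/jdk8u/langtools/archive/jdk8u272-ga.tar.bz2"
    print("default user-agent:", probe(url))
    print("custom user-agent: ", probe(url, "bitbake/2.8.0"))
```

Comparing the two status codes for the same URL gives a quick indication of whether a server is filtering on the user-agent, though it says nothing about why.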
On Sat, 15 Jun 2024 at 13:01, Livius via lists.openembedded.org <egyszeregy=freemail.hu@lists.openembedded.org> wrote:
> In openjdk-8 272 recipe all of link are broken if wget downloading doeas not use any custom user-agent.
> https://layers.openembedded.org/layerindex/recipe/44301/
>
> does not work -> wget https://hg.openjdk.java.net/jdk8u/jdk8u/langtools/archive/jdk8u272-ga.tar.bz2
> works -> wget --user-agent='bitbake/2.8.0' https://hg.openjdk.java.net/jdk8u/jdk8u/langtools/archive/jdk8u272-ga.tar.bz2

The layer is using an obsolete download location. The correct location is in the release announcement:
https://mail.openjdk.org/pipermail/jdk8u-dev/2020-October/012817.html
and it does work with wget:
https://openjdk-sources.osci.io/openjdk8/openjdk8u272-ga.tar.xz

The complete list of releases is at:
https://wiki.openjdk.org/display/jdk8u/Main

It's also likely tarball downloads with wget from mercurial VCS were explicitly disabled; trying to fake user agent will not go down well if that's the case. Please dig a bit deeper instead of giving up at the 'wget does not work' step.

> So it seems this patch is necessary to provide and it able to solve this kind of issues in the future.

I'm still not convinced it is necessary. Both known incidents can be solved without the patch.

Alex
diff --git a/conf/bitbake.conf b/conf/bitbake.conf
index f5a5a333a..a16f72d25 100644
--- a/conf/bitbake.conf
+++ b/conf/bitbake.conf
@@ -44,3 +44,4 @@ TARGET_ARCH = "${BUILD_ARCH}"
 TMPDIR = "${TOPDIR}/tmp"
 WORKDIR = "${TMPDIR}/work/${PF}"
 GITPKGV = "${@bb.fetch2.get_srcrev(d, 'gitpkgv_revision')}"
+BB_FETCH_USER_AGENT ??= "bitbake/${BB_VERSION}"
diff --git a/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst b/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
index fb4f0a23d..0ee07992f 100644
--- a/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
+++ b/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
@@ -221,13 +221,21 @@ HTTP/FTP wget fetcher (``http://``, ``ftp://``, ``https://``)
 This fetcher obtains files from web and FTP servers. Internally, the
 fetcher uses the wget utility.
 
-The executable and parameters used are specified by the
-``FETCHCMD_wget`` variable, which defaults to sensible values. The
-fetcher supports a parameter "downloadfilename" that allows the name of
-the downloaded file to be specified. Specifying the name of the
-downloaded file is useful for avoiding collisions in
-:term:`DL_DIR` when dealing with multiple files that
-have the same name.
+The executable and parameters used are specified by the ``FETCHCMD_wget``
+variable, which defaults to sensible values. The fetcher supports
+parameters:
+
+- *downloadfilename:* That allows the name of the downloaded file
+  to be specified.
+
+- *user_agent:* Enable to use a default ``bitbake/${BB_VERSION}`` user-agent
+  which is defined in :term:`BB_FETCH_USER_AGENT`.
+
+Specifying the name of the downloaded file is useful for avoiding
+collisions in :term:`DL_DIR` when dealing with multiple files
+that have the same name. A few HTTP servers block requests with
+the default wget user-agent, in this case specifying a valid
+user-agent can solve this issue.
 
 If a username and password are specified in the ``SRC_URI``, a Basic
 Authorization header will be added to each request, including across redirects.
@@ -239,6 +247,7 @@ Some example URLs are as follows::
    SRC_URI = "http://oe.handhelds.org/not_there.aac"
    SRC_URI = "ftp://oe.handhelds.org/not_there_as_well.aac"
    SRC_URI = "ftp://you@oe.handhelds.org/home/you/secret.plan"
+   SRC_URI = "https://oe.handhelds.org/not_there.aac;user_agent=1"
 
 .. note::
 
diff --git a/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst b/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
index 899e584f9..3f310bd72 100644
--- a/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
+++ b/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
@@ -699,6 +699,11 @@ overview of their function and contents.
       Within an executing task, this variable holds the hash of the task as
       returned by the currently enabled signature generator.
 
+   :term:`BB_FETCH_USER_AGENT`
+      Specifies a user-agent string which BitBake uses if "user_agent"
+      parameter is enabled for HTTP/FTP wget fetcher. Default value can
+      be found in ``conf/bitbake.conf``.
+
    :term:`BB_VERBOSE_LOGS`
       Controls how verbose BitBake is during builds. If set, shell scripts
       echo commands and shell script output appears on standard out
diff --git a/lib/bb/fetch2/wget.py b/lib/bb/fetch2/wget.py
index 2e9211763..30a56df71 100644
--- a/lib/bb/fetch2/wget.py
+++ b/lib/bb/fetch2/wget.py
@@ -52,11 +52,8 @@ class WgetProgressHandler(bb.progress.LineFilterProgressHandler):
 
 class Wget(FetchMethod):
     """Class to fetch urls via 'wget'"""
-
-    # CDNs like CloudFlare may do a 'browser integrity test' which can fail
-    # with the standard wget/urllib User-Agent, so pretend to be a modern
-    # browser.
-    user_agent = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0"
+    def init(self, d):