Message ID | 20241014133504.2390211-1-alex.kanavin@gmail.com |
---|---|
State | New |
Series | [RFC] fetch2/wget: set User-Agent to 'bitbake/version' in checkstatus() |
Initial testing behind the TI firewall showed no issues with these two changes. It *should* be ok to move ahead with it.

On 10/14/2024 8:35 AM, Alexander Kanavin wrote:
> From: Alexander Kanavin <alex@linutronix.de>
>
> This eliminates the last usage of 'fake mozilla' in bitbake, and
> it's then truthful everywhere about presenting itself, or wget
> (when that is used).
>
> I understand this will make people nervous so I want to provide
> an extended description.
>
> 1. How was this tested?
>
> - bitbake-selftest -k FetchCheckStatusTest
> (tests a few hardcoded URIs, all passed)
>
> - bitbake -k -c checkuri world
> (runs checkstatus() over all recipes in oe-core, and all passed again -
> this hopefully goes a long way to reassure everyone that hosts around
> the world and various CDNs typically do not have a problem with user-agent
> strings they haven't seen before or bitbake user-agent specifically)
>
> 2. What about that removed cloudflare comment?
>
> I dug into git history, and I think it is not fully accurate. First, 'fake
> mozilla' agent is used only for checkstatus() - in actual fetching with wget
> it is not. And that has not been a problem for anyone.
>
> Second, here's how the comment occurred. Usage of 'fake mozilla' was introduced here:
> https://git.yoctoproject.org/poky/commit/?h=master&id=ab26fdae9e5ae56bb84196698d3fa4fd568fe903
>
> At that point it did not have to be specifically 'mozilla', the commit message
> indicates that any User-Agent would have been ok. Mozilla was simply copied
> from upstream version check for convenience.
>
> Later on, the string was updated to a more recent Mozilla:
> https://git.yoctoproject.org/poky/commit/?h=master&id=9f123238261a68e37cec634782e9320633cac5d4
>
> The claim in the added comment became something else: that User-Agent *must* be a browser,
> without evidence or tests. Even though it demonstrably doesn't have to be - wget is ok.
>
> 3. What if someone has a server that is ok with wget agent, but not ok with bitbake agent?
>
> Please see point one. It's not impossible but I think it's highly unlikely. I do think
> we should rather tell servers the truth, and learn where the actual issues are. Then
> we can consider options - whether that would be pretending to be wget, or allowing user-agent
> to be configured. We should also add such servers to bitbake-selftest so we know what they
> are.
>
> Signed-off-by: Alexander Kanavin <alex@linutronix.de>
> ---
>  bitbake/lib/bb/fetch2/wget.py | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/bitbake/lib/bb/fetch2/wget.py b/bitbake/lib/bb/fetch2/wget.py
> index 493a5b62ee2..7856d10fa4b 100644
> --- a/bitbake/lib/bb/fetch2/wget.py
> +++ b/bitbake/lib/bb/fetch2/wget.py
> @@ -53,11 +53,6 @@ class WgetProgressHandler(bb.progress.LineFilterProgressHandler):
>  class Wget(FetchMethod):
>      """Class to fetch urls via 'wget'"""
>
> -    # CDNs like CloudFlare may do a 'browser integrity test' which can fail
> -    # with the standard wget/urllib User-Agent, so pretend to be a modern
> -    # browser.
> -    user_agent = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0"
> -
>      def check_certs(self, d):
>          """
>          Should certificates be checked?
> @@ -356,7 +351,7 @@ class Wget(FetchMethod):
>                  # Some servers (FusionForge, as used on Alioth) require that the
>                  # optional Accept header is set.
>                  r.add_header("Accept", "*/*")
> -                r.add_header("User-Agent", self.user_agent)
> +                r.add_header("User-Agent", "bitbake/{}".format(bb.__version__))
>                  def add_basic_auth(login_str, request):
>                      '''Adds Basic auth to http request, pass in login:password as string'''
>                      import base64
diff --git a/bitbake/lib/bb/fetch2/wget.py b/bitbake/lib/bb/fetch2/wget.py
index 493a5b62ee2..7856d10fa4b 100644
--- a/bitbake/lib/bb/fetch2/wget.py
+++ b/bitbake/lib/bb/fetch2/wget.py
@@ -53,11 +53,6 @@ class WgetProgressHandler(bb.progress.LineFilterProgressHandler):
 class Wget(FetchMethod):
     """Class to fetch urls via 'wget'"""
 
-    # CDNs like CloudFlare may do a 'browser integrity test' which can fail
-    # with the standard wget/urllib User-Agent, so pretend to be a modern
-    # browser.
-    user_agent = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0"
-
     def check_certs(self, d):
         """
         Should certificates be checked?
@@ -356,7 +351,7 @@ class Wget(FetchMethod):
                 # Some servers (FusionForge, as used on Alioth) require that the
                 # optional Accept header is set.
                 r.add_header("Accept", "*/*")
-                r.add_header("User-Agent", self.user_agent)
+                r.add_header("User-Agent", "bitbake/{}".format(bb.__version__))
                 def add_basic_auth(login_str, request):
                     '''Adds Basic auth to http request, pass in login:password as string'''
                     import base64
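
For readers who want to see the effect of the change outside of bitbake, here is a minimal standalone sketch of a status-check request that identifies itself as 'bitbake/version'. It is not the code from wget.py: BITBAKE_VERSION is a hypothetical stand-in for bb.__version__, build_check_request() is an illustrative helper name, the example.com URL is a placeholder, and the use of a HEAD request is an assumption about how such a check would typically be done. Nothing is sent over the network; the request object is only built and inspected.

```python
import urllib.request

# Hypothetical stand-in for bb.__version__ (assumption: the real value
# comes from the bitbake package, which is not imported here).
BITBAKE_VERSION = "2.10.0"


def build_check_request(url, version=BITBAKE_VERSION):
    """Build (but do not send) a request announcing itself as bitbake/<version>.

    Illustrative helper only; the name and signature are not from bitbake.
    """
    # Assumption: a status check only needs headers, so use HEAD.
    req = urllib.request.Request(url, method="HEAD")
    # Mirrors the two headers visible in the patch context.
    req.add_header("Accept", "*/*")
    req.add_header("User-Agent", "bitbake/{}".format(version))
    return req


# Placeholder URL, used only to construct the request object.
req = build_check_request("https://example.com/releases/foo-1.0.tar.gz")

# urllib stores header names capitalized as "User-agent".
print(req.get_method(), req.get_header("User-agent"))
```

If a server ever did reject this string, the version argument and the header value are the only places that would need to change in such a sketch, which is roughly the "configurable user-agent" option raised in point 3.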