| Message ID | 20250205071538.2681-1-stefan.herbrechtsmeier-oss@weidmueller.com |
|---|---|
| Headers | show |
| Series | Make mirror replacement syntax explicit | expand |
On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: > The mirror replacement syntax contains many implicit transformations. > The path of the URI always contains the base name of the downloaded > filename. This makes it impossible to rename or remove the base name of > the original path. It prevents upstream mirror for SRC_URIS with a > downloadfilename parameter. The base name of the downloaded filename > makes it impossible to use the download mirror for SRC_URIs with > subfolders in the downloadfilename parameter. Altogether the implicit > transformation complicates the understanding of the replacements. > > This series adds an additional replacement named DOWNLOADFILENAME. This > replacement contains the relative filename of the downloaded file or > mirror archive for git and hg. This allows the user to explicitly define > the behavior. The usage is equivalent to the PATH replacement for the > sstate mirror from file to https scheme. > > git://.*/.* http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME > https?://.*/.* http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME > file://.* https://sstate.yoctoproject.org/all/PATH;downloadfilename=PATH > > Without a replacement variable the mirror will use the same base name as > the origin SRC_URI. This allows the usage of private package manager > registry together with a downloadfilename parameter or the rename of the > base name. > > https://registry.npmjs.org/ https://example.com/npm/registry/ > https://example.com/example/1.0.0.tgz https://example.com/example/example-1.0.0.tgz > > The series adds heuristics to keep a backward compatibility to common > styles. Because of the ambiguity of the old style, it is advisable to > remove this compatibility sooner or later to avoid unexpected behavior. Thanks for the patches, these look interesting with some good improvements in there. A lot of the series looks like cleanups and those look like good fixes to have. It may make sense to split this series into two, the cleanups/fixes and the behaviour changes. I'm not entirely "sold" on the naming of DOWNLOADFILENAME. You have to think about this from the perspective of someone writing a MIRROR or PREMIRROR entry - would they understand what that means vs some of the other names? I'm also a bit nervous about breaking compatibility with the older syntaxes. It is unclear to me how or when we'd detect/deprecate older formats. I do agree we probably do need to remove some support for some syntax and move to something new though as what we have is turning into some kind of nightmare. FWIW this is really old code that in many ways predates my involvement so probably 20+ years old. I'll continue to give this some thought. If you could confirm which patches are "cleanup" that would help though, we can see if we can get those in a bit faster. Cheers, Richard
Am 05.02.2025 um 11:34 schrieb Richard Purdie: > On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: >> The mirror replacement syntax contains many implicit transformations. >> The path of the URI always contains the base name of the downloaded >> filename. This makes it impossible to rename or remove the base name of >> the original path. It prevents upstream mirror for SRC_URIS with a >> downloadfilename parameter. The base name of the downloaded filename >> makes it impossible to use the download mirror for SRC_URIs with >> subfolders in the downloadfilename parameter. Altogether the implicit >> transformation complicates the understanding of the replacements. >> >> This series adds an additional replacement named DOWNLOADFILENAME. This >> replacement contains the relative filename of the downloaded file or >> mirror archive for git and hg. This allows the user to explicitly define >> the behavior. The usage is equivalent to the PATH replacement for the >> sstate mirror from file to https scheme. >> >> git://.*/.*http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME >> https?://.*/.*http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME >> file://.* https://sstate.yoctoproject.org/all/PATH;downloadfilename=PATH >> >> Without a replacement variable the mirror will use the same base name as >> the origin SRC_URI. This allows the usage of private package manager >> registry together with a downloadfilename parameter or the rename of the >> base name. >> >> https://registry.npmjs.org/ https://example.com/npm/registry/ >> https://example.com/example/1.0.0.tgz https://example.com/example/example-1.0.0.tgz >> >> The series adds heuristics to keep a backward compatibility to common >> styles. Because of the ambiguity of the old style, it is advisable to >> remove this compatibility sooner or later to avoid unexpected behavior. > Thanks for the patches, these look interesting with some good > improvements in there. A lot of the series looks like cleanups and > those look like good fixes to have. It may make sense to split this > series into two, the cleanups/fixes and the behaviour changes. > > I'm not entirely "sold" on the naming of DOWNLOADFILENAME. You have to > think about this from the perspective of someone writing a MIRROR or > PREMIRROR entry - would they understand what that means vs some of the > other names? I’m open for suggestions. Even ARCHIVE or TARBALL are hard to understand because it is only a relative path on the download mirror. Alternative we can mark the lines as upstream or download mirror and give the replacement different meanings. The path could be the original PATH for an upstream mirror or the relative path of the downloaded file for the download mirror. > I'm also a bit nervous about breaking compatibility with the older > syntaxes. It is unclear to me how or when we'd detect/deprecate older > formats. I do agree we probably do need to remove some support for some > syntax and move to something new though as what we have is turning into > some kind of nightmare. FWIW this is really old code that in many ways > predates my involvement so probably 20+ years old. The only possibility is to mark the new format because it is impossible to know if a replacement want to keep the base name of the path. As example the following replacement is used in the test: file:///some1where/.* file://some2where/ The plain regular expressions replaces the complete path with a new path but the old code appends only the base name of the path. What was the desired behavior? file:///some1where/ file://some2where/ file:///some1where/.* file://some2where/BASENAME_OF_DOWNLOADFILENAME The main problem is that we have to keep the bugs if we keep the old behavior. > I'll continue to give this some thought. If you could confirm which > patches are "cleanup" that would help though, we can see if we can get > those in a bit faster. The first 8 patches are cleanups.
Am 05.02.2025 um 13:12 schrieb Stefan Herbrechtsmeier via lists.openembedded.org: > Am 05.02.2025 um 11:34 schrieb Richard Purdie: >> On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: >>> The mirror replacement syntax contains many implicit transformations. >>> The path of the URI always contains the base name of the downloaded >>> filename. This makes it impossible to rename or remove the base name of >>> the original path. It prevents upstream mirror for SRC_URIS with a >>> downloadfilename parameter. The base name of the downloaded filename >>> makes it impossible to use the download mirror for SRC_URIs with >>> subfolders in the downloadfilename parameter. Altogether the implicit >>> transformation complicates the understanding of the replacements. >>> >>> This series adds an additional replacement named DOWNLOADFILENAME. This >>> replacement contains the relative filename of the downloaded file or >>> mirror archive for git and hg. This allows the user to explicitly define >>> the behavior. The usage is equivalent to the PATH replacement for the >>> sstate mirror from file to https scheme. >>> >>> git://.*/.*http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME >>> https?://.*/.*http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME >>> file://.* https://sstate.yoctoproject.org/all/PATH;downloadfilename=PATH >>> >>> Without a replacement variable the mirror will use the same base name as >>> the origin SRC_URI. This allows the usage of private package manager >>> registry together with a downloadfilename parameter or the rename of the >>> base name. >>> >>> https://registry.npmjs.org/ https://example.com/npm/registry/ >>> https://example.com/example/1.0.0.tgz https://example.com/example/example-1.0.0.tgz >>> >>> The series adds heuristics to keep a backward compatibility to common >>> styles. Because of the ambiguity of the old style, it is advisable to >>> remove this compatibility sooner or later to avoid unexpected behavior. >> Thanks for the patches, these look interesting with some good >> improvements in there. A lot of the series looks like cleanups and >> those look like good fixes to have. It may make sense to split this >> series into two, the cleanups/fixes and the behaviour changes. >> >> I'm not entirely "sold" on the naming of DOWNLOADFILENAME. You have to >> think about this from the perspective of someone writing a MIRROR or >> PREMIRROR entry - would they understand what that means vs some of the >> other names? > > I’m open for suggestions. Even ARCHIVE or TARBALL are hard to > understand because it is only a relative path on the download mirror. > Alternative we can mark the lines as upstream or download mirror and > give the replacement different meanings. The path could be the > original PATH for an upstream mirror or the relative path of the > downloaded file for the download mirror. > >> I'm also a bit nervous about breaking compatibility with the older >> syntaxes. It is unclear to me how or when we'd detect/deprecate older >> formats. I do agree we probably do need to remove some support for some >> syntax and move to something new though as what we have is turning into >> some kind of nightmare. FWIW this is really old code that in many ways >> predates my involvement so probably 20+ years old. > > The only possibility is to mark the new format because it is > impossible to know if a replacement want to keep the base name of the > path. > > As example the following replacement is used in the test: > > file:///some1where/.* file://some2where/ > > The plain regular expressions replaces the complete path with a new > path but the old code appends only the base name of the path. What was > the desired behavior? > > file:///some1where/ file://some2where/ > file:///some1where/.* file://some2where/BASENAME_OF_DOWNLOADFILENAME > The BASENAME_OF_DOWNLOADFILENAME is wrong because the original is a local path and not a downloaded file: file:///some1where/.* file://some2where/BASENAME > The main problem is that we have to keep the bugs if we keep the old > behavior. > >> I'll continue to give this some thought. If you could confirm which >> patches are "cleanup" that would help though, we can see if we can get >> those in a bit faster. > > The first 8 patches are cleanups. >
On Wed, 2025-02-05 at 13:12 +0100, Stefan Herbrechtsmeier wrote: > Am 05.02.2025 um 11:34 schrieb Richard Purdie: > On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: > > > The mirror replacement syntax contains many implicit transformations. > > > The path of the URI always contains the base name of the downloaded > > > filename. This makes it impossible to rename or remove the base name of > > > the original path. It prevents upstream mirror for SRC_URIS with a > > > downloadfilename parameter. The base name of the downloaded filename > > > makes it impossible to use the download mirror for SRC_URIs with > > > subfolders in the downloadfilename parameter. Altogether the implicit > > > transformation complicates the understanding of the replacements. > > > > > > This series adds an additional replacement named DOWNLOADFILENAME. This > > > replacement contains the relative filename of the downloaded file or > > > mirror archive for git and hg. This allows the user to explicitly define > > > the behavior. The usage is equivalent to the PATH replacement for the > > > sstate mirror from file to https scheme. > > > > > > git://.*/.* http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME > > > https?://.*/.* http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME > > > file://.* https://sstate.yoctoproject.org/all/PATH;downloadfilename=PATH > > > > > > Without a replacement variable the mirror will use the same base name as > > > the origin SRC_URI. This allows the usage of private package manager > > > registry together with a downloadfilename parameter or the rename of the > > > base name. > > > > > > https://registry.npmjs.org/ https://example.com/npm/registry/ > > > https://example.com/example/1.0.0.tgz https://example.com/example/example-1.0.0.tgz > > > > > > The series adds heuristics to keep a backward compatibility to common > > > styles. Because of the ambiguity of the old style, it is advisable to > > > remove this compatibility sooner or later to avoid unexpected behavior. > > > > > Thanks for the patches, these look interesting with some good > > improvements in there. A lot of the series looks like cleanups and > > those look like good fixes to have. It may make sense to split this > > series into two, the cleanups/fixes and the behaviour changes. > > > > I'm not entirely "sold" on the naming of DOWNLOADFILENAME. You have to > > think about this from the perspective of someone writing a MIRROR or > > PREMIRROR entry - would they understand what that means vs some of the > > other names? > > > I’m open for suggestions. Even ARCHIVE or TARBALL are hard to > understand because it is only a relative path on the download mirror. > Alternative we can mark the lines as upstream or download mirror and > give the replacement different meanings. The path could be the > original PATH for an upstream mirror or the relative path of the > downloaded file for the download mirror. I've been giving this topic some thought. One idea I wondered about was to instead markup the mirror urls with how they're expected to work with a new parameter. For example: git://.*/.* http://downloads.yoctoproject.org/mirror/sources/?mirrorformat=mirrortarball The possible options would be something like: mirrortarball - mirror tarballs taken from DL_DIR flattened - copy of DL_DIR so DL_DIR layout (maybe call it dldir?) upstream - layout is the same as the upstream directory structure so a direct url replacement If using a mirrortarball mirror url, we'd know to use the values from urldata.mirrortarballs. We could add parameters to the fetcher to have two parameters, one will be the DL_DIR path and the other would be the upstream url path. One key question I have is how we might need to shorted the url path for some mirror urls to add/remove a path prefix in the mirroring. I think that would cover most of our scenarios and make the mirror urls much more useful/explicit. It also would give us a migration path as the code can simply error if it sees a mirror url without a mirrorformat parameter. This may mean we could have a clean slate for the mirror urls and switch to one clear format. I think we can make a case for such a breaking architecture change. To move it forward we'd need to work out which pieces of mirror syntax we should drop, which mirror replacement strings we need (I'm not convinced the current ones are right/useful). We'd probably need a some proof of concept patches showing it in action and a proposal to the oe- arch list explaining why the change was needed and what the change would do/look like. I think this can be done independently of the other fetcher/vendor changing but should be able to work for the vendoring issues too. Thoughts? I'm sure I'm missing things here :/. Cheers, Richard
Am 20.02.2025 um 11:22 schrieb Richard Purdie via lists.openembedded.org: > On Wed, 2025-02-05 at 13:12 +0100, Stefan Herbrechtsmeier wrote: >> Am 05.02.2025 um 11:34 schrieb Richard Purdie: >> On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: >>>> The mirror replacement syntax contains many implicit transformations. >>>> The path of the URI always contains the base name of the downloaded >>>> filename. This makes it impossible to rename or remove the base name of >>>> the original path. It prevents upstream mirror for SRC_URIS with a >>>> downloadfilename parameter. The base name of the downloaded filename >>>> makes it impossible to use the download mirror for SRC_URIs with >>>> subfolders in the downloadfilename parameter. Altogether the implicit >>>> transformation complicates the understanding of the replacements. >>>> >>>> This series adds an additional replacement named DOWNLOADFILENAME. This >>>> replacement contains the relative filename of the downloaded file or >>>> mirror archive for git and hg. This allows the user to explicitly define >>>> the behavior. The usage is equivalent to the PATH replacement for the >>>> sstate mirror from file to https scheme. >>>> >>>> git://.*/.*http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME >>>> https?://.*/.*http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME >>>> file://.* https://sstate.yoctoproject.org/all/PATH;downloadfilename=PATH >>>> >>>> Without a replacement variable the mirror will use the same base name as >>>> the origin SRC_URI. This allows the usage of private package manager >>>> registry together with a downloadfilename parameter or the rename of the >>>> base name. >>>> >>>> https://registry.npmjs.org/ https://example.com/npm/registry/ >>>> https://example.com/example/1.0.0.tgz https://example.com/example/example-1.0.0.tgz >>>> >>>> The series adds heuristics to keep a backward compatibility to common >>>> styles. Because of the ambiguity of the old style, it is advisable to >>>> remove this compatibility sooner or later to avoid unexpected behavior. >>>> >>> Thanks for the patches, these look interesting with some good >>> improvements in there. A lot of the series looks like cleanups and >>> those look like good fixes to have. It may make sense to split this >>> series into two, the cleanups/fixes and the behaviour changes. >>> >>> I'm not entirely "sold" on the naming of DOWNLOADFILENAME. You have to >>> think about this from the perspective of someone writing a MIRROR or >>> PREMIRROR entry - would they understand what that means vs some of the >>> other names? >>> >> I’m open for suggestions. Even ARCHIVE or TARBALL are hard to >> understand because it is only a relative path on the download mirror. >> Alternative we can mark the lines as upstream or download mirror and >> give the replacement different meanings. The path could be the >> original PATH for an upstream mirror or the relative path of the >> downloaded file for the download mirror. > I've been giving this topic some thought. One idea I wondered about was > to instead markup the mirror urls with how they're expected to work > with a new parameter. For example: > > git://.*/.*http://downloads.yoctoproject.org/mirror/sources/?mirrorformat=mirrortarball The ? could be problematic because it is the separator for the query. It is unlikely that the user really use this query parameter but it could complicate the code because we have to handle additional query parameters. What does the "?mirrorformat=mirrortarball" mean? Will it work like a MIRRORTARBALL replacement? How does a simple replacement should look like? http:// https:// Because of the backward compatible this will replace the basename of the path. > The possible options would be something like: > > mirrortarball - mirror tarballs taken from DL_DIR > flattened - copy of DL_DIR so DL_DIR layout (maybe call it dldir?) > upstream - layout is the same as the upstream directory structure so a direct url replacement Do you think we have to handle the mirror tarball explicit? The mirror tarball is required for a scheme change. > If using a mirrortarball mirror url, we'd know to use the values from > urldata.mirrortarballs. We could add parameters to the fetcher to have > two parameters, one will be the DL_DIR path and the other would be the > upstream url path. I don't understand where this is needed, because the mirror tarball and downloadfilename are used by different fetchers. > One key question I have is how we might need to > shorted the url path for some mirror urls to add/remove a path prefix > in the mirroring. What do you mean by this? The downloadfilename could contain a path without any problem after my change. > I think that would cover most of our scenarios and make the mirror urls > much more useful/explicit. > > It also would give us a migration path as the code can simply error if > it sees a mirror url without a mirrorformat parameter. This may mean we > could have a clean slate for the mirror urls and switch to one clear > format. I think we can make a case for such a breaking architecture > change. Does this means we doesn't need a backward compatibility? Alternative we could make some replacement mandatory if a wildcard is used to detect obsolete entries. > To move it forward we'd need to work out which pieces of mirror syntax > we should drop, which mirror replacement strings we need (I'm not > convinced the current ones are right/useful). We'd probably need a some > proof of concept patches showing it in action and a proposal to the oe- > arch list explaining why the change was needed and what the change > would do/look like. > > I think this can be done independently of the other fetcher/vendor > changing but should be able to work for the vendoring issues too. Sure. I can rework my patch as soon as we have an agreement.
On Thu, 2025-02-20 at 12:45 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: > > Am 20.02.2025 um 11:22 schrieb Richard Purdie via > lists.openembedded.org: > > On Wed, 2025-02-05 at 13:12 +0100, Stefan Herbrechtsmeier wrote: > > > Am 05.02.2025 um 11:34 schrieb Richard Purdie: > > > On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: > > > I’m open for suggestions. Even ARCHIVE or TARBALL are hard to > > > understand because it is only a relative path on the download mirror. > > > Alternative we can mark the lines as upstream or download mirror and > > > give the replacement different meanings. The path could be the > > > original PATH for an upstream mirror or the relative path of the > > > downloaded file for the download mirror. > > > > > > > I've been giving this topic some thought. One idea I wondered about was > > to instead markup the mirror urls with how they're expected to work > > with a new parameter. For example: > > > > git://.*/.* http://downloads.yoctoproject.org/mirror/sources/?mirrorformat=mirrortarball > > > The ? could be problematic because it is the separator for the query. > It is unlikely that the user really use this query parameter but it > could complicate the code because we have to handle additional query > parameters. > What does the "?mirrorformat=mirrortarball" mean? Will it work like a > MIRRORTARBALL replacement? The mirrorformat parameter would be used by the mirroring code itself to understand how to handle the url. It would be dropped from the modified url so is only therefore our code's use. If there are additional parameters they would be passed through as they are now. > How does a simple replacement should look like? > > http:// https:// > > Because of the backward compatible this will replace the basename of > the path. It would depend how the mirror is laid out. Some mirrors flatten the urls like DL_DIR is laid out, some potentially don't. The standard usage would likely have a mirrorformat=dldir parameter added. > > The possible options would be something like: > > > > mirrortarball - mirror tarballs taken from DL_DIR > > flattened - copy of DL_DIR so DL_DIR layout (maybe call it dldir?) > > upstream - layout is the same as the upstream directory structure so a direct url replacement > > > Do you think we have to handle the mirror tarball explicit? The > mirror tarball is required for a scheme change. If we do that, we can avoid having to guess at too many urls to test to figure out a mirror format so I think it would be an improvement on where we are today. > > If using a mirrortarball mirror url, we'd know to use the values from > > urldata.mirrortarballs. We could add parameters to the fetcher to have > > two parameters, one will be the DL_DIR path and the other would be the > > upstream url path. > > > I don't understand where this is needed, because the mirror tarball > and downloadfilename are used by different fetchers. Please keep in mind that downloadfilename is pretty much a misfeature. It was added as we couldn't control collisions inside dl_dir but it creates all kind of other problems. I think we do need to handle that problem case but it does then mean we have to indicate whether any given mirror uses "dldir" or "upstream" names and paths. > > One key question I have is how we might need to > > shorted the url path for some mirror urls to add/remove a path prefix > > in the mirroring. > > > What do you mean by this? The downloadfilename could contain a path > without any problem after my change. See above, downloadfilename is not something I'm keen to promote and is creating several of the problems we have by badly trying to hack extra functionality onto the fetcher without thinking through all the issues like mirroring. This is about the fact that you could have: htttps://some.server/some/deep/multi/level/path/ and a mirror of things there at: htttps://some.server/shortpath/path/ so the mirror urls need to be able to map htttps://some.server/some/deep/multi/level/path/a/b/c.tgz to htttps://some.server/shortpath/path/a/b/c.tgz but also: htttps://some.server/shortpath/path/c.tgz or possibly htttps://some.server/shortpath/path.d.tgz if downloadfilename is in action. > > > > I think that would cover most of our scenarios and make the mirror urls > > much more useful/explicit. > > > > It also would give us a migration path as the code can simply error if > > it sees a mirror url without a mirrorformat parameter. This may mean we > > could have a clean slate for the mirror urls and switch to one clear > > format. I think we can make a case for such a breaking architecture > > change. > > > Does this means we doesn't need a backward compatibility? If we can come up with sufficient justification to break things and can detect and error correctly for old usage and get this signed off by the TSC/community, potentially, yes. > Alternative we could make some replacement mandatory if a wildcard is > used to detect obsolete entries. I don't understand that. I do think we have too many problems in the existing mirroring url mapping and we probably need to rework this rather than try and pile more patches into it and complicate it further. The question is whether the proposal fixes the issues it needs to and has enough simplification and benefit to justify making the change. Cheers, Richard
Am 20.02.2025 um 13:21 schrieb Richard Purdie: > On Thu, 2025-02-20 at 12:45 +0100, Stefan Herbrechtsmeier via > lists.openembedded.org wrote: >> Am 20.02.2025 um 11:22 schrieb Richard Purdie via lists.openembedded.org: >>> On Wed, 2025-02-05 at 13:12 +0100, Stefan Herbrechtsmeier wrote: >>>> Am 05.02.2025 um 11:34 schrieb Richard Purdie: >>>> On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: >>>> I’m open for suggestions. Even ARCHIVE or TARBALL are hard to >>>> understand because it is only a relative path on the download mirror. >>>> Alternative we can mark the lines as upstream or download mirror and >>>> give the replacement different meanings. The path could be the >>>> original PATH for an upstream mirror or the relative path of the >>>> downloaded file for the download mirror. >>> I've been giving this topic some thought. One idea I wondered about was >>> to instead markup the mirror urls with how they're expected to work >>> with a new parameter. For example: >>> git://.*/.*http://downloads.yoctoproject.org/mirror/sources/?mirrorformat=mirrortarball >> The ? could be problematic because it is the separator for the query. >> It is unlikely that the user really use this query parameter but it >> could complicate the code because we have to handle additional query >> parameters. >> What does the "?mirrorformat=mirrortarball" mean? Will it work like a >> MIRRORTARBALL replacement? > > The mirrorformat parameter would be used by the mirroring code itself > to understand how to handle the url. What is the different to a MIRRORTARBALL replacement? The code will replace the word with the content. > It would be dropped from the modified url so is only therefore our > code's use. If there are additional parameters they would be passed > through as they are now. > >> How does a simple replacement should look like? >> >> http:// https:// >> >> Because of the backward compatible this will replace the basename of >> the path. > > It would depend how the mirror is laid out. Some mirrors flatten the > urls like DL_DIR is laid out, some potentially don't. The standard > usage would likely have a mirrorformat=dldir parameter added. How does the user specify an entry that replace the http scheme with https and keeps everything else like it is (upstream mirror)? >>> The possible options would be something like: >>> mirrortarball - mirror tarballs taken from DL_DIR >>> flattened - copy of DL_DIR so DL_DIR layout (maybe call it dldir?) >>> upstream - layout is the same as the upstream directory structure so a direct url replacement >> Do you think we have to handle the mirror tarball explicit? The >> mirror tarball is required for a scheme change. > > If we do that, we can avoid having to guess at too many urls to test > to figure out a mirror format so I think it would be an improvement on > where we are today. Do you mean we will test if the URL have a mirrortarball and if not skip the entry? > >>> If using a mirrortarball mirror url, we'd know to use the values from >>> urldata.mirrortarballs. We could add parameters to the fetcher to have >>> two parameters, one will be the DL_DIR path and the other would be the >>> upstream url path. >> I don't understand where this is needed, because the mirror tarball >> and downloadfilename are used by different fetchers. > > Please keep in mind that downloadfilename is pretty much a misfeature. > It was added as we couldn't control collisions inside dl_dir but it > creates all kind of other problems. I think we do need to handle that > problem case but it does then mean we have to indicate whether any > given mirror uses "dldir" or "upstream" names and paths. I don't understand the problem. The download mirror will use the downloadfilename or its default the basename of the localpath. The upstream mirror will use the path. The mirrortarball will use the mirrortarball. Why the fetcher need two parameters? >>> One key question I have is how we might need to >>> shorted the url path for some mirror urls to add/remove a path prefix >>> in the mirroring. >> What do you mean by this? The downloadfilename could contain a path >> without any problem after my change. > > See above, downloadfilename is not something I'm keen to promote and > is creating several of the problems we have by badly trying to hack > extra functionality onto the fetcher without thinking through all the > issues like mirroring. What is the desired way to avoid name clashes? The package manager fetcher need an generic way to override the basename. > This is about the fact that you could have: > > htttps://some.server/some/deep/multi/level/path/ > > and a mirror of things there at: > > htttps://some.server/shortpath/path/ > > so the mirror urls need to be able to map > > htttps://some.server/some/deep/multi/level/path/a/b/c.tgz This could be solved if we use my proposed code. > to > > htttps://some.server/shortpath/path/a/b/c.tgz htttps://some.server/some/deep/multi/level/path/ htttps://some.server/shortpath/path/ > but also: > > htttps://some.server/shortpath/path/c.tgz htttps://some.server/some/deep/multi/level/path/.* htttps://some.server/shortpath/path/DOWNLOADFILENAME > or possibly > > htttps://some.server/shortpath/path.d.tgz > if downloadfilename is in action. Because the fallback for the downloadfilename is the basename the same entry works. htttps://some.server/some/deep/multi/level/path/.* htttps://some.server/shortpath/DOWNLOADFILENAME >>> I think that would cover most of our scenarios and make the mirror urls >>> much more useful/explicit. >>> It also would give us a migration path as the code can simply error if >>> it sees a mirror url without a mirrorformat parameter. This may mean we >>> could have a clean slate for the mirror urls and switch to one clear >>> format. I think we can make a case for such a breaking architecture >>> change. >> Does this means we doesn't need a backward compatibility? > > If we can come up with sufficient justification to break things and > can detect and error correctly for old usage and get this signed off > by the TSC/community, potentially, yes. Okay >> Alternative we could make some replacement mandatory if a wildcard is >> used to detect obsolete entries. > > I don't understand that. If we have a wildcard .* in the path we need to know how to replace it. This could be the PATH, BASENAME, DOWNLOADFILENAME, MIRRORARCHIVE or re group. But in case of the re group this could still be an old entry. Do support all cases we need a fix prefix or delimiter: r:http https http#https http|https http?://.*/.*|http://downloads.yoctoproject.org/mirror/sources/DOWNLOADARCHIVE git://.*/.*|http://downloads.yoctoproject.org/mirror/sources/MIRRORARCHIVE > I do think we have too many problems in the existing mirroring url > mapping and we probably need to rework this rather than try and pile > more patches into it and complicate it further. I have already rework it. If I can remove the backward compatibility and replace it with an error this would simplify the code. I only need a better name for the DOWNLOADFILENAME (DOWNLOADARCHIVE) and add the MIRRORARCHIV. > The question is whether the proposal fixes the issues it needs to and > has enough simplification and benefit to justify making the change. My patches support folders in the downloadfilename, upstream mirrors and renames. I have to rework the MIRROR strings but therefore the commented out tests work.
On Thu, 2025-02-20 at 18:37 +0100, Stefan Herbrechtsmeier wrote: > > Am 20.02.2025 um 13:21 schrieb Richard Purdie: > > On Thu, 2025-02-20 at 12:45 +0100, Stefan Herbrechtsmeier via > > lists.openembedded.org wrote: > > > Am 20.02.2025 um 11:22 schrieb Richard Purdie via > > > lists.openembedded.org: > > > > On Wed, 2025-02-05 at 13:12 +0100, Stefan Herbrechtsmeier wrote: > > > > > Am 05.02.2025 um 11:34 schrieb Richard Purdie: > > > > > > > > > > On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: > > > > > > > > > > I’m open for suggestions. Even ARCHIVE or TARBALL are hard to > > > > > > > > > > understand because it is only a relative path on the download mirror. > > > > > > > > > > Alternative we can mark the lines as upstream or download mirror and > > > > > > > > > > give the replacement different meanings. The path could be the > > > > > > > > > > original PATH for an upstream mirror or the relative path of the > > > > > > > > > > downloaded file for the download mirror. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I've been giving this topic some thought. One idea I wondered about was > > > > > > > > to instead markup the mirror urls with how they're expected to work > > > > > > > > with a new parameter. For example: > > > > > > > > git://.*/.* http://downloads.yoctoproject.org/mirror/sources/?mirrorformat=mirrortarball > > > > > > > > > > > > > > > > > > The ? could be problematic because it is the separator for the > > > query. It is unlikely that the user really use this query > > > parameter but it could complicate the code because we have to > > > handle additional query parameters. > > > What does the "?mirrorformat=mirrortarball" mean? Will it work > > > like a MIRRORTARBALL replacement? > > > > > > > > > > > > > The mirrorformat parameter would be used by the mirroring code > > itself to understand how to handle the url. > > > > What is the different to a MIRRORTARBALL replacement? The code will > replace the word with the content. Think about this from a usability perspective. We're struggling to even work out good names for your proposal. Even if we work out the names, I still don't think users are going to understand how to convert urls into the new syntax. The difference with my proposed format is that we're specifying it in a way which I suspect users will better understand without needing to go and read the docs every time. We're saying what we're configuring with the "mirrorformat" key and then the value should be able to describe the format. My proposal also gives us both a way to clearly detect when obsolete formatting is used and a namespace mechanism to extend, with both being in a way we can easily and clearly describe in the docs. I appreciate with your proposal we can add more strings and we can add docs about how to migrate but I suspect users aren't going to be as readily/easily able to understand it. > > It would be dropped from the modified url so is only therefore our > > code's use. If there are additional parameters they would be passed > > through as they are now. > > > > > > > > > > > > > > How does a simple replacement should look like? > > > > > > http:// https:// > > > > > > Because of the backward compatible this will replace the basename > > > of the path. > > > > > > > > > > > > > It would depend how the mirror is laid out. Some mirrors flatten > > the urls like DL_DIR is laid out, some potentially don't. The > > standard usage would likely have a mirrorformat=dldir parameter > > added. > > > > How does the user specify an entry that replace the http scheme with > https and keeps everything else like it is (upstream mirror)? http://.*/.* https://.*/.*?mirrorformat=upstream We need to determine the best value for "upstream". I'd also like to review whether the .* formatting is the best way to handle this if we are going to change the format. > > > > The possible options would be something like: > > > > > > > > mirrortarball - mirror tarballs taken from DL_DIR > > > > > > > > flattened - copy of DL_DIR so DL_DIR layout (maybe call it dldir?) > > > > > > > > upstream - layout is the same as the upstream directory structure so a direct url replacement > > > > > > > > > > > > > > > > > > Do you think we have to handle the mirror tarball explicit? The > > > mirror tarball is required for a scheme change. > > > > > > > > > > > > > If we do that, we can avoid having to guess at too many urls to > > test to figure out a mirror format so I think it would be an > > improvement on where we are today. > > > > Do you mean we will test if the URL have a mirrortarball and if not > skip the entry? Correct. > > > > If using a mirrortarball mirror url, we'd know to use the values from > > > > > > > > urldata.mirrortarballs. We could add parameters to the fetcher to have > > > > > > > > two parameters, one will be the DL_DIR path and the other would be the > > > > > > > > upstream url path. > > > > > > > > > > > > > > > > > > I don't understand where this is needed, because the mirror > > > tarball and downloadfilename are used by different fetchers. > > > > > > > > > > > > > Please keep in mind that downloadfilename is pretty much a > > misfeature. It was added as we couldn't control collisions inside > > dl_dir but it creates all kind of other problems. I think we do > > need to handle that problem case but it does then mean we have to > > indicate whether any given mirror uses "dldir" or "upstream" names > > and paths. > > > > I don't understand the problem. The download mirror will use the > downloadfilename or its default the basename of the localpath. The > upstream mirror will use the path. The mirrortarball will use the > mirrortarball. Why the fetcher need two parameters? You are trying to make downloadfilename a supported parameter of every fetcher. I'm arguing that I wish we'd never added it at all and that I'd rather not use it or encourage its use. I don't think you understand the way the fetcher API was written/used and this is why some of the patches are still on hold in master-next until I can convince myself they are actually the right thing to do. There have been too many other misunderstandings to give me confidence they're going to do the right thing :(. Sadly, I just don't have the time do the right level of review and everything else being asked of me. > > > > One key question I have is how we might need to > > > > > > > > shorted the url path for some mirror urls to add/remove a path prefix > > > > > > > > in the mirroring. > > > > > > > > > > > > > > > > > > What do you mean by this? The downloadfilename could contain a > > > path without any problem after my change. > > > > > > > > > > > > > See above, downloadfilename is not something I'm keen to promote > > and is creating several of the problems we have by badly trying to > > hack extra functionality onto the fetcher without thinking through > > all the issues like mirroring. > > > > What is the desired way to avoid name clashes? The package manager > fetcher need an generic way to override the basename. Why does it need that? Usually we've used the directory layout to avoid problems where we can for example. I've been hoping we could do similar here rather than use downloadfilename, which causes so many mirroring issues in the first place. > > > Alternative we could make some replacement mandatory if a > > > wildcard is used to detect obsolete entries. > > > > > > > > > > > > > I don't understand that. > > > > If we have a wildcard .* in the path we need to know how to replace > it. This could be the PATH, BASENAME, DOWNLOADFILENAME, MIRRORARCHIVE > or re group. But in case of the re group this could still be an old > entry. I'm not entirely sure we want to keep all the different syntax. One frustration I have with the current code is the way pattern matches are restricted to that url component for example and I've wondered if we could/should do something different instead, if we can make it simpler. > Do support all cases we need a fix prefix or delimiter: > > r:http https > > http#https > http|https > > http?://.*/.*| > http://downloads.yoctoproject.org/mirror/sources/DOWNLOADARCHIVE > git://.*/.*| > http://downloads.yoctoproject.org/mirror/sources/MIRRORARCHIVE > > > > > > I do think we have too many problems in the existing mirroring url > > mapping and we probably need to rework this rather than try and > > pile more patches into it and complicate it further. > > > I have already rework it. If I can remove the backward compatibility > and replace it with an error this would simplify the code. I only > need a better name for the DOWNLOADFILENAME (DOWNLOADARCHIVE) and add > the MIRRORARCHIV. > > > > > The question is whether the proposal fixes the issues it needs to > > and has enough simplification and benefit to justify making the > > change. > > > My patches support folders in the downloadfilename, upstream mirrors > and renames. I have to rework the MIRROR strings but therefore the > commented out tests work. > You've created a patch, yes. I don't think it improves usability though and I think you're also pushing concepts like downloadfilename into places we might not want to use them too. For me to merge patches like these, there needs to be a sense of trust and shared understanding. This simply isn't there, you're just saying your patches are fine as they are, I disagree. I therefore worry we're at an impasse and are going to struggle to move beyond this. That does make me quite sad. Regards, Richard
Am 20.02.2025 um 23:00 schrieb Richard Purdie: > On Thu, 2025-02-20 at 18:37 +0100, Stefan Herbrechtsmeier wrote: >> Am 20.02.2025 um 13:21 schrieb Richard Purdie: >>> On Thu, 2025-02-20 at 12:45 +0100, Stefan Herbrechtsmeier via >>> lists.openembedded.org wrote: >>>> Am 20.02.2025 um 11:22 schrieb Richard Purdie via >>>> lists.openembedded.org: >>>>> On Wed, 2025-02-05 at 13:12 +0100, Stefan Herbrechtsmeier wrote: >>>>>> Am 05.02.2025 um 11:34 schrieb Richard Purdie: >>>>>> On Wed, 2025-02-05 at 08:15 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote: >>>>>> I’m open for suggestions. Even ARCHIVE or TARBALL are hard to >>>>>> understand because it is only a relative path on the download mirror. >>>>>> Alternative we can mark the lines as upstream or download mirror and >>>>>> give the replacement different meanings. The path could be the >>>>>> original PATH for an upstream mirror or the relative path of the >>>>>> downloaded file for the download mirror. >>>>> I've been giving this topic some thought. One idea I wondered about was >>>>> to instead markup the mirror urls with how they're expected to work >>>>> with a new parameter. For example: >>>>> git://.*/.*http://downloads.yoctoproject.org/mirror/sources/?mirrorformat=mirrortarball >>>> The ? could be problematic because it is the separator for the >>>> query. It is unlikely that the user really use this query parameter >>>> but it could complicate the code because we have to handle >>>> additional query parameters. >>>> What does the "?mirrorformat=mirrortarball" mean? Will it work like >>>> a MIRRORTARBALL replacement? >>> >>> The mirrorformat parameter would be used by the mirroring code >>> itself to understand how to handle the url. >> >> What is the different to a MIRRORTARBALL replacement? The code will >> replace the word with the content. >> > Think about this from a usability perspective. We're struggling to > even work out good names for your proposal. Even if we work out the names, We need the good names in any case. The ? is misleading because it is part of a common URL. > I still don't think users are going to understand how to convert urls > into the new syntax. git://.*/.* http://downloads.yoctoproject.org/mirror/sources/?mirrorformat=mirrortarball git://.*/.* http://downloads.yoctoproject.org/mirror/sources/MIRRORTARBALL git://.*/.* http://downloads.yoctoproject.org/mirror/sources/{MIRRORTARBALL} git://.*/.* http://downloads.yoctoproject.org/mirror/sources/download.cgi?filename=?mirrorformat=mirrortarball git://.*/.* http://downloads.yoctoproject.org/mirror/sources/download.cgi?filename=MIRRORTARBALL git://.*/.* http://downloads.yoctoproject.org/mirror/sources/download.cgi?filename={MIRRORTARBALL} > The difference with my proposed format is that we're specifying it in > a way which I suspect users will better understand without needing to > go and read the docs every time. We're saying what we're configuring > with the "mirrorformat" key and then the value should be able to > describe the format. In this case we should add a prefix or use a parameter git://.*/.* scmmirror:http://downloads.yoctoproject.org/mirror/sources/ http?://.*/.* downloadmirror:https://downloads.yoctoproject.org/mirror/sources/ http:// plain:https:// git://.*/.* http://downloads.yoctoproject.org/mirror/sources/;mirrorformat=scm http?://.*/.* downloadmirror:https://downloads.yoctoproject.org/mirror/sources/;mirrorformat=download http:// https://;mirrorformat=plain > My proposal also gives us both a way to clearly detect when obsolete > formatting is used and a namespace mechanism to extend, with both > being in a way we can easily and clearly describe in the docs. But it makes it impossible to support arbitrary regular expressions. https://a.com/b/c/d.tar https://x.com/y/z.tar https://a.com/b/(.*) https://x.com/y/\1 > I appreciate with your proposal we can add more strings and we can add > docs about how to migrate but I suspect users aren't going to be as > readily/easily able to understand it. The main difference is that in your case the user have to read the docs to learn the pre- or postfix and in my case it has to learn the placeholder. In any case the user need to understand that there is a difference between the download and a plain mirror. Either he has to mark the entry and hope that the magic behind work for his use case or he has to place the placeholder at the correct position. git://.*/.* https://downloads.abc.org/mirror/sources/download/MIRRORTARBALL git://.*/.* https://downloads.abc.org/mirror/sources/download.cgi?filename=MIRRORTARBALL git://(.*)/(.*) https://downloads.abc.org/mirror/sources/download/\1/\2/MIRRORTARBALL >>> It would be dropped from the modified url so is only therefore our >>> code's use. If there are additional parameters they would be passed >>> through as they are now. >>> >>> >>>> How does a simple replacement should look like? >>>> >>>> http:// https:// >>>> >>>> Because of the backward compatible this will replace the basename >>>> of the path. >>> >>> It would depend how the mirror is laid out. Some mirrors flatten the >>> urls like DL_DIR is laid out, some potentially don't. The standard >>> usage would likely have a mirrorformat=dldir parameter added. >> >> How does the user specify an entry that replace the http scheme with >> https and keeps everything else like it is (upstream mirror)? >> > > http://.*/.* https://.*/.*?mirrorformat=upstream I assume you mean the following because this doesn't make sense. http:// https://?mirrorformat=upstream What happens if the user need to add the query parameter mirror=1: http:// https://?mirrorformat=upstream?mirror=1 > We need to determine the best value for "upstream". I'd also like to > review whether the .* formatting is the best way to handle this if we > are going to change the format. > >>>>> The possible options would be something like: >>>>> mirrortarball - mirror tarballs taken from DL_DIR >>>>> flattened - copy of DL_DIR so DL_DIR layout (maybe call it dldir?) >>>>> upstream - layout is the same as the upstream directory structure so a direct url replacement >>>> Do you think we have to handle the mirror tarball explicit? The >>>> mirror tarball is required for a scheme change. >>> >>> If we do that, we can avoid having to guess at too many urls to test >>> to figure out a mirror format so I think it would be an improvement >>> on where we are today. >> >> Do you mean we will test if the URL have a mirrortarball and if not >> skip the entry? >> > Correct. But doesn't the mirrortarball depends on the scheme and we doesn't use the entry in anyway? Do you have an example? >>>>> If using a mirrortarball mirror url, we'd know to use the values from >>>>> urldata.mirrortarballs. We could add parameters to the fetcher to have >>>>> two parameters, one will be the DL_DIR path and the other would be the >>>>> upstream url path. >>>> I don't understand where this is needed, because the mirror tarball >>>> and downloadfilename are used by different fetchers. >>> >>> Please keep in mind that downloadfilename is pretty much a >>> misfeature. It was added as we couldn't control collisions inside >>> dl_dir but it creates all kind of other problems. I think we do need >>> to handle that problem case but it does then mean we have to >>> indicate whether any given mirror uses "dldir" or "upstream" names >>> and paths. >> >> I don't understand the problem. The download mirror will use the >> downloadfilename or its default the basename of the localpath. The >> upstream mirror will use the path. The mirrortarball will use the >> mirrortarball. Why the fetcher need two parameters? >> > You are trying to make downloadfilename a supported parameter of every > fetcher. No, I simply harmonize the use of the parameter and remove the problematic assumption that the filename inside the download folder doesn't contain additional folders. A lot of fetcher already support the downloadfilename. Why should I assume that this is wrong and not the desired way. > I'm arguing that I wish we'd never added it at all and that I'd rather > not use it or encourage its use. What is your alternative to handle name clashes? At the moment it is unclear if the usage of the basename or the folders inside the downloadfilename are the bug. The problem is, that even the encoded URI as filename doesn't ensure name clashes because some projects doesn't encode the version inside the URI. The information from the recipe like BP is also problematical. > I don't think you understand the way the fetcher API was written/used Based on the code I assume the code was used to implement a download mirror. Thereby it assume that the download folder use a flat list of files. The problem is that neither the usage of the downloadfilename nor the syntax for the MIRROR variables reflect this. The functions contains some magic inside it. Based on the code I assume there is a need for folders inside the downloadfilename and for the support of upstream mirrors. > > > and this is why some of the patches are still on hold in master-next > until I can convince myself they are actually the right thing to do. > There have been too many other misunderstandings to give me confidence > they're going to do the right thing :(. Sadly, I just don't have the > time do the right level of review and everything else being asked of me. Again my problem is that I don't know the vision. I look at the code and develop a solution which looks reasonable for the different existing use cases. >>>>> One key question I have is how we might need to >>>>> shorted the url path for some mirror urls to add/remove a path prefix >>>>> in the mirroring. >>>> What do you mean by this? The downloadfilename could contain a path >>>> without any problem after my change. >>> >>> See above, downloadfilename is not something I'm keen to promote and >>> is creating several of the problems we have by badly trying to hack >>> extra functionality onto the fetcher without thinking through all >>> the issues like mirroring. >> >> What is the desired way to avoid name clashes? The package manager >> fetcher need an generic way to override the basename. >> > Why does it need that? What do we do if the upstream only provide a single URI to get the last version? > Usually we've used the directory layout to avoid problems where we can > for example. I've been hoping we could do similar here rather than use > downloadfilename, which causes so many mirroring issues in the first > place. Usually? The wget fetcher only use the basename. The npm fetcher use the scope and basename. The crate fetcher the name and version together with a specific file extension. Only the gomod fetcher use the path. It looks like the gomod fetcher only works because it use the path. How should I know that it is the only correct user of the downloadfilename? The mirror code only use the basename of the downloadfilename / path. The download mirror only works if the entry keeps the upstream path. >>>> Alternative we could make some replacement mandatory if a wildcard >>>> is used to detect obsolete entries. >>> >>> I don't understand that. >> >> If we have a wildcard .* in the path we need to know how to replace >> it. This could be the PATH, BASENAME, DOWNLOADFILENAME, MIRRORARCHIVE >> or re group. But in case of the re group this could still be an old >> entry. >> > I'm not entirely sure we want to keep all the different syntax. One > frustration I have with the current code is the way pattern matches > are restricted to that url component for example and I've wondered if > we could/should do something different instead, if we can make it simpler. What is the purpose of the MIRROR. Should it be support any use case (regex) or should it only support specific use cases? Maybe I could simple remove the url components. Until now I simple minimize the changes to simplify the review. >> Do support all cases we need a fix prefix or delimiter: >> >> r:http https >> >> http#https >> http|https >> >> http?://.*/.*|http://downloads.yoctoproject.org/mirror/sources/DOWNLOADARCHIVE >> git://.*/.*|http://downloads.yoctoproject.org/mirror/sources/MIRRORARCHIVE >> >> >>> I do think we have too many problems in the existing mirroring url >>> mapping and we probably need to rework this rather than try and pile >>> more patches into it and complicate it further. >> I have already rework it. If I can remove the backward compatibility >> and replace it with an error this would simplify the code. I only >> need a better name for the DOWNLOADFILENAME (DOWNLOADARCHIVE) and add >> the MIRRORARCHIV. > >> >>> The question is whether the proposal fixes the issues it needs to >>> and has enough simplification and benefit to justify making the change. >> My patches support folders in the downloadfilename, upstream mirrors >> and renames. I have to rework the MIRROR strings but therefore the >> commented out tests work. >> > > > You've created a patch, yes. I don't think it improves usability > though and I think you're also pushing concepts like downloadfilename > into places we might not want to use them too. Until now I don't know the desired solution. Now I think your desired solution is to mimic the upstream in the download folder and this is a requirement for the rework of the mirror code. This means we have to fix the exiting code first and should document this requirement. > For me to merge patches like these, there needs to be a sense of trust > and shared understanding. This simply isn't there, you're just saying > your patches are fine as they are, I disagree. No. I don't understand what problem your proposed solution like the "?" solve. The solutions comes with drawbacks and I want to understand the advantages which balance the disadvantage. > I therefore worry we're at an impasse and are going to struggle to > move beyond this. That does make me quite sad. We haven't an impasse. The problem is the code doesn't reflect your concepts and it's hard for me to identify them from your replies. If I assume that the download folder should mimic the upstream we doesn't need the DOWNLOADARCHIVE because it should be the same as the PATH. But I'm unsure if this works because the PATH isn't enough to unique identify a file and a path not always contain a file extension. https://www.linuxtv.org/hg/dvb-apps/archive/3d43b280298c.tar.bz2;downloadfilename=${BPN}-3d43b280298c.tar.bz2 https://wrapdb.mesonbuild.com/v2/fmt_11.0.2-1/get_patch;downloadfilename=fmt_11.0.2-1_patch.zip http://miniupnp.tuxfamily.org/files/download.php?file=${BP}.tar.gz;downloadfilename=${BP}.tar.gz https://selenic.com/repo/${BPN}/archive/${HG_CHANGESET}.tar.bz2;downloadfilename=${BP}.tar.bz2 http://www.netlib.org/benchmark/dhry-c;downloadfilename=dhry-c.shar https://sourceforge.net/p/giflib/code/ci/d54b45b0240d455bbaedee4be5203d2703e59967/tree/doc/giflib-logo.gif?format=raw;downloadfilename=giflib-logo.gif Additionally the same PATH could be used on different hosts. If we remove the possibility for the user to override the filename inside the download folder we have to ensure that every URI leads to an other filename inside the download folder. We need to generate a file path from the whole URI to append it to the download mirror URI. I suspect we have to define the vision for the download folder first before we could discuses how the mirror entries should look like.
From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com> The mirror replacement syntax contains many implicit transformations. The path of the URI always contains the base name of the downloaded filename. This makes it impossible to rename or remove the base name of the original path. It prevents upstream mirror for SRC_URIS with a downloadfilename parameter. The base name of the downloaded filename makes it impossible to use the download mirror for SRC_URIs with subfolders in the downloadfilename parameter. Altogether the implicit transformation complicates the understanding of the replacements. This series adds an additional replacement named DOWNLOADFILENAME. This replacement contains the relative filename of the downloaded file or mirror archive for git and hg. This allows the user to explicitly define the behavior. The usage is equivalent to the PATH replacement for the sstate mirror from file to https scheme. git://.*/.* http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME https?://.*/.* http://downloads.yoctoproject.org/mirror/sources/DOWNLOADFILENAME file://.* https://sstate.yoctoproject.org/all/PATH;downloadfilename=PATH Without a replacement variable the mirror will use the same base name as the origin SRC_URI. This allows the usage of private package manager registry together with a downloadfilename parameter or the rename of the base name. https://registry.npmjs.org/ https://example.com/npm/registry/ https://example.com/example/1.0.0.tgz https://example.com/example/example-1.0.0.tgz The series adds heuristics to keep a backward compatibility to common styles. Because of the ambiguity of the old style, it is advisable to remove this compatibility sooner or later to avoid unexpected behavior. Stefan Herbrechtsmeier (15): fetch2: remove unnecessary expand function calls fetch2: local: use path variable fetch2: remove unnecessary unquote fetch2: ssh: use common localpath handling fetch2: clearcase: remove double DL_DIR from localfile fetch2: remove basepath from FetchData fetch2: remove basename from FetchData fetch2: use localpath instead of localfile fetch2: make DOWNLOADFILENAME and PATH explicit in mirrors tests: fetch: unify style of replaceuris in MirrorUriTest tests: fetch: fix nonsensical replaceuris in MirrorUriTest tests: fetch: reenable replaceuris in MirrorUriTest tests: fetch: comment and add replaceuris in MirrorUriTest tests: fetch: add npm mirrors replaceuris in MirrorUriTest tests: fetch: fix nonsensical mirror uris in FetcherNetworkTest lib/bb/fetch2/__init__.py | 62 ++++++++++++++++++++------------------ lib/bb/fetch2/az.py | 8 ++--- lib/bb/fetch2/clearcase.py | 2 -- lib/bb/fetch2/gcp.py | 8 ----- lib/bb/fetch2/local.py | 8 ++--- lib/bb/fetch2/npm.py | 2 +- lib/bb/fetch2/repo.py | 2 +- lib/bb/fetch2/s3.py | 7 ----- lib/bb/fetch2/sftp.py | 10 +----- lib/bb/fetch2/ssh.py | 3 +- lib/bb/fetch2/wget.py | 11 +------ lib/bb/tests/fetch.py | 62 ++++++++++++++++++++++++++++++++------ 12 files changed, 98 insertions(+), 87 deletions(-)