diff mbox series

[1/1] fetch2/git: Use basename for gitsrcname when ud.proto is file

Message ID 8121be4437aeae47c5b1d6870e1561b2b088198f.1719033434.git.liezhi.yang@windriver.com
State New
Headers show
Series [1/1] fetch2/git: Use basename for gitsrcname when ud.proto is file | expand

Commit Message

Robert Yang June 22, 2024, 5:20 a.m. UTC
From: Robert Yang <liezhi.yang@windriver.com>

The previous code added local path to generated mirror tarball name which had
two potential issues when fetch from local PREMIRROR, for exapmle:

Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
$ bitbake lsof -cfetch

* There might be "Fix File name too long error" when the local path is long
  since the path is translated to filename and added to downloads/git2/ as:
  path.to.local.PREMIRROR.github.com.lsof-org.lsof

* When enable BB_GENERATE_MIRROR_TARBALLS:
  The tarball name will be:
  git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz

  The above tarball can't be re-used by do_fetch since bitbake looks for
  'git2_github.com.lsof-org.lsof.tar.gz', but the 'path.to.local.PREMIRROR' in
  the file name breaks the match.

Use os.path.basename(ud.path) as gitsrcname when ud.proto is file can fix the
the above two issues, now the file name in downloads/git2/ is:
github.com.lsof-org.lsof

And the generated tarball is:
git2_github.com.lsof-org.lsof.tar.gz

Or when BB_GIT_SHALLOW is enabled:
gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz

And the tarballs can be re-used by do_fetch.

Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
---
 bitbake/lib/bb/fetch2/git.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Richard Purdie July 14, 2024, 11:17 a.m. UTC | #1
On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via lists.openembedded.org wrote:
> From: Robert Yang <liezhi.yang@windriver.com>
> 
> The previous code added local path to generated mirror tarball name which had
> two potential issues when fetch from local PREMIRROR, for exapmle:
> 
> Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
> $ bitbake lsof -cfetch
> 
> * There might be "Fix File name too long error" when the local path is long
>   since the path is translated to filename and added to downloads/git2/ as:
>   path.to.local.PREMIRROR.github.com.lsof-org.lsof
> 
> * When enable BB_GENERATE_MIRROR_TARBALLS:
>   The tarball name will be:
>   git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
> 
>   The above tarball can't be re-used by do_fetch since bitbake looks for
>   'git2_github.com.lsof-org.lsof.tar.gz', but the 'path.to.local.PREMIRROR' in
>   the file name breaks the match.
> 
> Use os.path.basename(ud.path) as gitsrcname when ud.proto is file can fix the
> the above two issues, now the file name in downloads/git2/ is:
> github.com.lsof-org.lsof
> 
> And the generated tarball is:
> git2_github.com.lsof-org.lsof.tar.gz
> 
> Or when BB_GIT_SHALLOW is enabled:
> gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
> 
> And the tarballs can be re-used by do_fetch.
> 
> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
> ---
>  bitbake/lib/bb/fetch2/git.py | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> index c7ff769fdf..89d3268f7f 100644
> --- a/bitbake/lib/bb/fetch2/git.py
> +++ b/bitbake/lib/bb/fetch2/git.py
> @@ -277,7 +277,10 @@ class Git(FetchMethod):
>                      ud.unresolvedrev[name] = ud.revisions[name]
>                  ud.revisions[name] = self.latest_revision(ud, d, name)
>  
> -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
> +        if ud.proto == "file":
> +            gitsrcname = '%s' % os.path.basename(ud.path)
> +        else:
> +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
>          if gitsrcname.startswith('.'):
>              gitsrcname = gitsrcname[1:]
>  

I started to review this, then I started wondering why we have mirror
tarballs being created for local file:// urls? Does that make sense?

Cheers,

Richard
Robert Yang July 14, 2024, 1:03 p.m. UTC | #2
Hi RP,

On 7/14/24 19:17, Richard Purdie wrote:
> On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via lists.openembedded.org wrote:
>> From: Robert Yang <liezhi.yang@windriver.com>
>>
>> The previous code added local path to generated mirror tarball name which had
>> two potential issues when fetch from local PREMIRROR, for exapmle:
>>
>> Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
>> $ bitbake lsof -cfetch
>>
>> * There might be "Fix File name too long error" when the local path is long
>>    since the path is translated to filename and added to downloads/git2/ as:
>>    path.to.local.PREMIRROR.github.com.lsof-org.lsof
>>
>> * When enable BB_GENERATE_MIRROR_TARBALLS:
>>    The tarball name will be:
>>    git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
>>
>>    The above tarball can't be re-used by do_fetch since bitbake looks for
>>    'git2_github.com.lsof-org.lsof.tar.gz', but the 'path.to.local.PREMIRROR' in
>>    the file name breaks the match.
>>
>> Use os.path.basename(ud.path) as gitsrcname when ud.proto is file can fix the
>> the above two issues, now the file name in downloads/git2/ is:
>> github.com.lsof-org.lsof
>>
>> And the generated tarball is:
>> git2_github.com.lsof-org.lsof.tar.gz
>>
>> Or when BB_GIT_SHALLOW is enabled:
>> gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
>>
>> And the tarballs can be re-used by do_fetch.
>>
>> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
>> ---
>>   bitbake/lib/bb/fetch2/git.py | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
>> index c7ff769fdf..89d3268f7f 100644
>> --- a/bitbake/lib/bb/fetch2/git.py
>> +++ b/bitbake/lib/bb/fetch2/git.py
>> @@ -277,7 +277,10 @@ class Git(FetchMethod):
>>                       ud.unresolvedrev[name] = ud.revisions[name]
>>                   ud.revisions[name] = self.latest_revision(ud, d, name)
>>   
>> -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
>> +        if ud.proto == "file":
>> +            gitsrcname = '%s' % os.path.basename(ud.path)
>> +        else:
>> +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
>>           if gitsrcname.startswith('.'):
>>               gitsrcname = gitsrcname[1:]
>>   
> 
> I started to review this, then I started wondering why we have mirror
> tarballs being created for local file:// urls? Does that make sense?

This patch fixed two issues:

1) "File name too long error" when fetch from PREMIRROR
2) Git mirror tarball contains local path when file://

Create git mirror tarball for file:// doesn't make much sense, but the problem
is the first issue, there would be the error when len(path) > 255 (The NAME_MAX).

// Robert

> 
> Cheers,
> 
> Richard
Richard Purdie July 14, 2024, 1:32 p.m. UTC | #3
On Sun, 2024-07-14 at 21:03 +0800, Robert Yang wrote:
> Hi RP,
> 
> On 7/14/24 19:17, Richard Purdie wrote:
> > On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via
> > lists.openembedded.org wrote:
> > > From: Robert Yang <liezhi.yang@windriver.com>
> > > 
> > > The previous code added local path to generated mirror tarball
> > > name which had
> > > two potential issues when fetch from local PREMIRROR, for
> > > exapmle:
> > > 
> > > Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
> > > $ bitbake lsof -cfetch
> > > 
> > > * There might be "Fix File name too long error" when the local
> > > path is long
> > >    since the path is translated to filename and added to
> > > downloads/git2/ as:
> > >    path.to.local.PREMIRROR.github.com.lsof-org.lsof
> > > 
> > > * When enable BB_GENERATE_MIRROR_TARBALLS:
> > >    The tarball name will be:
> > >    git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
> > > 
> > >    The above tarball can't be re-used by do_fetch since bitbake
> > > looks for
> > >    'git2_github.com.lsof-org.lsof.tar.gz', but the
> > > 'path.to.local.PREMIRROR' in
> > >    the file name breaks the match.
> > > 
> > > Use os.path.basename(ud.path) as gitsrcname when ud.proto is file
> > > can fix the
> > > the above two issues, now the file name in downloads/git2/ is:
> > > github.com.lsof-org.lsof
> > > 
> > > And the generated tarball is:
> > > git2_github.com.lsof-org.lsof.tar.gz
> > > 
> > > Or when BB_GIT_SHALLOW is enabled:
> > > gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
> > > 
> > > And the tarballs can be re-used by do_fetch.
> > > 
> > > Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
> > > ---
> > >   bitbake/lib/bb/fetch2/git.py | 5 ++++-
> > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/bitbake/lib/bb/fetch2/git.py
> > > b/bitbake/lib/bb/fetch2/git.py
> > > index c7ff769fdf..89d3268f7f 100644
> > > --- a/bitbake/lib/bb/fetch2/git.py
> > > +++ b/bitbake/lib/bb/fetch2/git.py
> > > @@ -277,7 +277,10 @@ class Git(FetchMethod):
> > >                       ud.unresolvedrev[name] = ud.revisions[name]
> > >                   ud.revisions[name] = self.latest_revision(ud,
> > > d, name)
> > >   
> > > -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'),
> > > ud.path.replace('/', '.').replace('*', '.').replace('
> > > ','_').replace('(', '_').replace(')', '_'))
> > > +        if ud.proto == "file":
> > > +            gitsrcname = '%s' % os.path.basename(ud.path)
> > > +        else:
> > > +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'),
> > > ud.path.replace('/', '.').replace('*', '.').replace('
> > > ','_').replace('(', '_').replace(')', '_'))
> > >           if gitsrcname.startswith('.'):
> > >               gitsrcname = gitsrcname[1:]
> > >   
> > 
> > I started to review this, then I started wondering why we have
> > mirror
> > tarballs being created for local file:// urls? Does that make
> > sense?
> 
> This patch fixed two issues:
> 
> 1) "File name too long error" when fetch from PREMIRROR
> 2) Git mirror tarball contains local path when file://
> 
> Create git mirror tarball for file:// doesn't make much sense, but
> the problem
> is the first issue, there would be the error when len(path) > 255
> (The NAME_MAX).

I understand that. However:

a) why it is looking for a file:// url in PREMIRRORS?
b) why does it consider mirror tarballs for a file:// url?

I'm not sure it should be doing either of those things?

Cheers,

Richard
Robert Yang July 14, 2024, 1:54 p.m. UTC | #4
Hi RP,

On 7/14/24 21:32, Richard Purdie via lists.openembedded.org wrote:
> On Sun, 2024-07-14 at 21:03 +0800, Robert Yang wrote:
>> Hi RP,
>>
>> On 7/14/24 19:17, Richard Purdie wrote:
>>> On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via
>>> lists.openembedded.org wrote:
>>>> From: Robert Yang <liezhi.yang@windriver.com>
>>>>
>>>> The previous code added local path to generated mirror tarball
>>>> name which had
>>>> two potential issues when fetch from local PREMIRROR, for
>>>> exapmle:
>>>>
>>>> Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
>>>> $ bitbake lsof -cfetch
>>>>
>>>> * There might be "Fix File name too long error" when the local
>>>> path is long
>>>>     since the path is translated to filename and added to
>>>> downloads/git2/ as:
>>>>     path.to.local.PREMIRROR.github.com.lsof-org.lsof
>>>>
>>>> * When enable BB_GENERATE_MIRROR_TARBALLS:
>>>>     The tarball name will be:
>>>>     git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
>>>>
>>>>     The above tarball can't be re-used by do_fetch since bitbake
>>>> looks for
>>>>     'git2_github.com.lsof-org.lsof.tar.gz', but the
>>>> 'path.to.local.PREMIRROR' in
>>>>     the file name breaks the match.
>>>>
>>>> Use os.path.basename(ud.path) as gitsrcname when ud.proto is file
>>>> can fix the
>>>> the above two issues, now the file name in downloads/git2/ is:
>>>> github.com.lsof-org.lsof
>>>>
>>>> And the generated tarball is:
>>>> git2_github.com.lsof-org.lsof.tar.gz
>>>>
>>>> Or when BB_GIT_SHALLOW is enabled:
>>>> gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
>>>>
>>>> And the tarballs can be re-used by do_fetch.
>>>>
>>>> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
>>>> ---
>>>>    bitbake/lib/bb/fetch2/git.py | 5 ++++-
>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/bitbake/lib/bb/fetch2/git.py
>>>> b/bitbake/lib/bb/fetch2/git.py
>>>> index c7ff769fdf..89d3268f7f 100644
>>>> --- a/bitbake/lib/bb/fetch2/git.py
>>>> +++ b/bitbake/lib/bb/fetch2/git.py
>>>> @@ -277,7 +277,10 @@ class Git(FetchMethod):
>>>>                        ud.unresolvedrev[name] = ud.revisions[name]
>>>>                    ud.revisions[name] = self.latest_revision(ud,
>>>> d, name)
>>>>    
>>>> -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'),
>>>> ud.path.replace('/', '.').replace('*', '.').replace('
>>>> ','_').replace('(', '_').replace(')', '_'))
>>>> +        if ud.proto == "file":
>>>> +            gitsrcname = '%s' % os.path.basename(ud.path)
>>>> +        else:
>>>> +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'),
>>>> ud.path.replace('/', '.').replace('*', '.').replace('
>>>> ','_').replace('(', '_').replace(')', '_'))
>>>>            if gitsrcname.startswith('.'):
>>>>                gitsrcname = gitsrcname[1:]
>>>>    
>>>
>>> I started to review this, then I started wondering why we have
>>> mirror
>>> tarballs being created for local file:// urls? Does that make
>>> sense?
>>
>> This patch fixed two issues:
>>
>> 1) "File name too long error" when fetch from PREMIRROR
>> 2) Git mirror tarball contains local path when file://
>>
>> Create git mirror tarball for file:// doesn't make much sense, but
>> the problem
>> is the first issue, there would be the error when len(path) > 255
>> (The NAME_MAX).
> 
> I understand that. However:
> 
> a) why it is looking for a file:// url in PREMIRRORS?

This is a little WRLinux specific, we release git repos with the product,
and set PREMIRROS as:

PREMIRRORS:append = " \
      git://.*/.* git://${LAYERDIR}/git/MIRRORNAME;protocol=file \n \
      gitsm://.*/.* gitsm://${LAYERDIR}/git/MIRRORNAME;protocol=file \n \


> b) why does it consider mirror tarballs for a file:// url?

I found the issue when I tried to generate shallow tarballs from the above 
PREMIRRORS.

// Robert.

> 
> I'm not sure it should be doing either of those things?
> 
> Cheers,
> 
> Richard
> 
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#16421): https://lists.openembedded.org/g/bitbake-devel/message/16421
> Mute This Topic: https://lists.openembedded.org/mt/106812686/3616940
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [liezhi.yang@windriver.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
Richard Purdie Aug. 6, 2024, 11:06 a.m. UTC | #5
On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via lists.openembedded.org wrote:
> From: Robert Yang <liezhi.yang@windriver.com>
> 
> The previous code added local path to generated mirror tarball name which had
> two potential issues when fetch from local PREMIRROR, for exapmle:
> 
> Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
> $ bitbake lsof -cfetch
> 
> * There might be "Fix File name too long error" when the local path is long
>   since the path is translated to filename and added to downloads/git2/ as:
>   path.to.local.PREMIRROR.github.com.lsof-org.lsof
> 
> * When enable BB_GENERATE_MIRROR_TARBALLS:
>   The tarball name will be:
>   git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
> 
>   The above tarball can't be re-used by do_fetch since bitbake looks for
>   'git2_github.com.lsof-org.lsof.tar.gz', but the 'path.to.local.PREMIRROR' in
>   the file name breaks the match.
> 
> Use os.path.basename(ud.path) as gitsrcname when ud.proto is file can fix the
> the above two issues, now the file name in downloads/git2/ is:
> github.com.lsof-org.lsof
> 
> And the generated tarball is:
> git2_github.com.lsof-org.lsof.tar.gz
> 
> Or when BB_GIT_SHALLOW is enabled:
> gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
> 
> And the tarballs can be re-used by do_fetch.
> 
> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
> ---
>  bitbake/lib/bb/fetch2/git.py | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> index c7ff769fdf..89d3268f7f 100644
> --- a/bitbake/lib/bb/fetch2/git.py
> +++ b/bitbake/lib/bb/fetch2/git.py
> @@ -277,7 +277,10 @@ class Git(FetchMethod):
>                      ud.unresolvedrev[name] = ud.revisions[name]
>                  ud.revisions[name] = self.latest_revision(ud, d, name)
>  
> -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
> +        if ud.proto == "file":
> +            gitsrcname = '%s' % os.path.basename(ud.path)
> +        else:
> +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
>          if gitsrcname.startswith('.'):
>              gitsrcname = gitsrcname[1:]


I did look at this patch again. I'm still not convinced about the
original use case but ignoring that, the patch itself isn't too safe.
If you have two repositories with the same name, this can break since
basename() on the URI is not a unique value. 

Just as a note, '%s' % x is also poor code as you can just assign the
value.

Cheers,

Richard
Ross Burton Aug. 6, 2024, 11:20 a.m. UTC | #6
> On 6 Aug 2024, at 12:06, Richard Purdie via lists.openembedded.org <richard.purdie=linuxfoundation.org@lists.openembedded.org> wrote:
> I did look at this patch again. I'm still not convinced about the
> original use case but ignoring that, the patch itself isn't too safe.
> If you have two repositories with the same name, this can break since
> basename() on the URI is not a unique value. 

I just tested this and verified this is the behaviour.  With a recipe that fetches a local git repo called xyzzy it currently does a clone to $DL_DIR/git2/home.rosbur01.xyzzy but this changes that path to $DL_DIR/git2/xyzzy.  My fear is that this is too ambiguous for a DL_DIR path.

Ross
Robert Yang Aug. 6, 2024, 1:42 p.m. UTC | #7
Hi RP and Ross,

On 8/6/24 19:06, Richard Purdie wrote:
> On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via lists.openembedded.org wrote:
>> From: Robert Yang <liezhi.yang@windriver.com>
>>
>> The previous code added local path to generated mirror tarball name which had
>> two potential issues when fetch from local PREMIRROR, for exapmle:
>>
>> Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
>> $ bitbake lsof -cfetch
>>
>> * There might be "Fix File name too long error" when the local path is long
>>    since the path is translated to filename and added to downloads/git2/ as:
>>    path.to.local.PREMIRROR.github.com.lsof-org.lsof
>>
>> * When enable BB_GENERATE_MIRROR_TARBALLS:
>>    The tarball name will be:
>>    git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
>>
>>    The above tarball can't be re-used by do_fetch since bitbake looks for
>>    'git2_github.com.lsof-org.lsof.tar.gz', but the 'path.to.local.PREMIRROR' in
>>    the file name breaks the match.
>>
>> Use os.path.basename(ud.path) as gitsrcname when ud.proto is file can fix the
>> the above two issues, now the file name in downloads/git2/ is:
>> github.com.lsof-org.lsof
>>
>> And the generated tarball is:
>> git2_github.com.lsof-org.lsof.tar.gz
>>
>> Or when BB_GIT_SHALLOW is enabled:
>> gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
>>
>> And the tarballs can be re-used by do_fetch.
>>
>> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
>> ---
>>   bitbake/lib/bb/fetch2/git.py | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
>> index c7ff769fdf..89d3268f7f 100644
>> --- a/bitbake/lib/bb/fetch2/git.py
>> +++ b/bitbake/lib/bb/fetch2/git.py
>> @@ -277,7 +277,10 @@ class Git(FetchMethod):
>>                       ud.unresolvedrev[name] = ud.revisions[name]
>>                   ud.revisions[name] = self.latest_revision(ud, d, name)
>>   
>> -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
>> +        if ud.proto == "file":
>> +            gitsrcname = '%s' % os.path.basename(ud.path)
>> +        else:
>> +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
>>           if gitsrcname.startswith('.'):
>>               gitsrcname = gitsrcname[1:]
> 
> 
> I did look at this patch again. I'm still not convinced about the
> original use case but ignoring that, the patch itself isn't too safe.

Thanks for looking into this, the main problem is that we need a way to fix the
filename too long error as described in the commit message above.

> If you have two repositories with the same name, this can break since
> basename() on the URI is not a unique value.

If the two repos are identical, then it isn't a problem since bitbake
can handle them correctly, if the two local repos with the same
name but have different contents, I think that the user should fix their repos.

> 
> Just as a note, '%s' % x is also poor code as you can just assign the
> value.

Yes, absolutely, that's my mistake.

// Robert

> 
> Cheers,
> 
> Richard
> 
> 
>
Robert Yang Aug. 6, 2024, 1:50 p.m. UTC | #8
On 8/6/24 21:42, Robert Yang via lists.openembedded.org wrote:
> Hi RP and Ross,

Sorry, correct Ross' email.

// Robert

> 
> On 8/6/24 19:06, Richard Purdie wrote:
>> On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via lists.openembedded.org wrote:
>>> From: Robert Yang <liezhi.yang@windriver.com>
>>>
>>> The previous code added local path to generated mirror tarball name which had
>>> two potential issues when fetch from local PREMIRROR, for exapmle:
>>>
>>> Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
>>> $ bitbake lsof -cfetch
>>>
>>> * There might be "Fix File name too long error" when the local path is long
>>>    since the path is translated to filename and added to downloads/git2/ as:
>>>    path.to.local.PREMIRROR.github.com.lsof-org.lsof
>>>
>>> * When enable BB_GENERATE_MIRROR_TARBALLS:
>>>    The tarball name will be:
>>>    git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
>>>
>>>    The above tarball can't be re-used by do_fetch since bitbake looks for
>>>    'git2_github.com.lsof-org.lsof.tar.gz', but the 'path.to.local.PREMIRROR' in
>>>    the file name breaks the match.
>>>
>>> Use os.path.basename(ud.path) as gitsrcname when ud.proto is file can fix the
>>> the above two issues, now the file name in downloads/git2/ is:
>>> github.com.lsof-org.lsof
>>>
>>> And the generated tarball is:
>>> git2_github.com.lsof-org.lsof.tar.gz
>>>
>>> Or when BB_GIT_SHALLOW is enabled:
>>> gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
>>>
>>> And the tarballs can be re-used by do_fetch.
>>>
>>> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
>>> ---
>>>   bitbake/lib/bb/fetch2/git.py | 5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
>>> index c7ff769fdf..89d3268f7f 100644
>>> --- a/bitbake/lib/bb/fetch2/git.py
>>> +++ b/bitbake/lib/bb/fetch2/git.py
>>> @@ -277,7 +277,10 @@ class Git(FetchMethod):
>>>                       ud.unresolvedrev[name] = ud.revisions[name]
>>>                   ud.revisions[name] = self.latest_revision(ud, d, name)
>>> -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'), 
>>> ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', 
>>> '_').replace(')', '_'))
>>> +        if ud.proto == "file":
>>> +            gitsrcname = '%s' % os.path.basename(ud.path)
>>> +        else:
>>> +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'), 
>>> ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', 
>>> '_').replace(')', '_'))
>>>           if gitsrcname.startswith('.'):
>>>               gitsrcname = gitsrcname[1:]
>>
>>
>> I did look at this patch again. I'm still not convinced about the
>> original use case but ignoring that, the patch itself isn't too safe.
> 
> Thanks for looking into this, the main problem is that we need a way to fix the
> filename too long error as described in the commit message above.
> 
>> If you have two repositories with the same name, this can break since
>> basename() on the URI is not a unique value.
> 
> If the two repos are identical, then it isn't a problem since bitbake
> can handle them correctly, if the two local repos with the same
> name but have different contents, I think that the user should fix their repos.
> 
>>
>> Just as a note, '%s' % x is also poor code as you can just assign the
>> value.
> 
> Yes, absolutely, that's my mistake.
> 
> // Robert
> 
>>
>> Cheers,
>>
>> Richard
>>
>>
>>
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#16447): https://lists.openembedded.org/g/bitbake-devel/message/16447
> Mute This Topic: https://lists.openembedded.org/mt/106812686/3616940
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [liezhi.yang@windriver.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
Richard Purdie Aug. 6, 2024, 2:10 p.m. UTC | #9
On Tue, 2024-08-06 at 21:42 +0800, Robert Yang wrote:
> Hi RP and Ross,
> 
> On 8/6/24 19:06, Richard Purdie wrote:
> > On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via lists.openembedded.org wrote:
> > > From: Robert Yang <liezhi.yang@windriver.com>
> > > 
> > > The previous code added local path to generated mirror tarball name which had
> > > two potential issues when fetch from local PREMIRROR, for exapmle:
> > > 
> > > Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
> > > $ bitbake lsof -cfetch
> > > 
> > > * There might be "Fix File name too long error" when the local path is long
> > >    since the path is translated to filename and added to downloads/git2/ as:
> > >    path.to.local.PREMIRROR.github.com.lsof-org.lsof
> > > 
> > > * When enable BB_GENERATE_MIRROR_TARBALLS:
> > >    The tarball name will be:
> > >    git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
> > > 
> > >    The above tarball can't be re-used by do_fetch since bitbake looks for
> > >    'git2_github.com.lsof-org.lsof.tar.gz', but the 'path.to.local.PREMIRROR' in
> > >    the file name breaks the match.
> > > 
> > > Use os.path.basename(ud.path) as gitsrcname when ud.proto is file can fix the
> > > the above two issues, now the file name in downloads/git2/ is:
> > > github.com.lsof-org.lsof
> > > 
> > > And the generated tarball is:
> > > git2_github.com.lsof-org.lsof.tar.gz
> > > 
> > > Or when BB_GIT_SHALLOW is enabled:
> > > gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
> > > 
> > > And the tarballs can be re-used by do_fetch.
> > > 
> > > Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
> > > ---
> > >   bitbake/lib/bb/fetch2/git.py | 5 ++++-
> > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> > > index c7ff769fdf..89d3268f7f 100644
> > > --- a/bitbake/lib/bb/fetch2/git.py
> > > +++ b/bitbake/lib/bb/fetch2/git.py
> > > @@ -277,7 +277,10 @@ class Git(FetchMethod):
> > >                       ud.unresolvedrev[name] = ud.revisions[name]
> > >                   ud.revisions[name] = self.latest_revision(ud, d, name)
> > >   
> > > -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
> > > +        if ud.proto == "file":
> > > +            gitsrcname = '%s' % os.path.basename(ud.path)
> > > +        else:
> > > +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
> > >           if gitsrcname.startswith('.'):
> > >               gitsrcname = gitsrcname[1:]
> > 
> > 
> > I did look at this patch again. I'm still not convinced about the
> > original use case but ignoring that, the patch itself isn't too safe.
> 
> Thanks for looking into this, the main problem is that we need a way to fix the
> filename too long error as described in the commit message above.
> 
> > If you have two repositories with the same name, this can break since
> > basename() on the URI is not a unique value.
> 
> If the two repos are identical, then it isn't a problem since bitbake
> can handle them correctly, if the two local repos with the same
> name but have different contents, I think that the user should fix their repos.

The trouble is that this doesn't just affect local repos, the contents
of DL_DIR can and will be shared as a public mirror so it is any two
repos with the same name, not just two local repos.

This is partly why I still think mirrors of local file repos don't make
sense.

Cheers,

Richard
Robert Yang Aug. 7, 2024, 2:23 a.m. UTC | #10
Hi RP,

On 8/6/24 22:10, Richard Purdie wrote:
> On Tue, 2024-08-06 at 21:42 +0800, Robert Yang wrote:
>> Hi RP and Ross,
>>
>> On 8/6/24 19:06, Richard Purdie wrote:
>>> On Fri, 2024-06-21 at 22:20 -0700, Robert Yang via lists.openembedded.org wrote:
>>>> From: Robert Yang <liezhi.yang@windriver.com>
>>>>
>>>> The previous code added local path to generated mirror tarball name which had
>>>> two potential issues when fetch from local PREMIRROR, for exapmle:
>>>>
>>>> Add github.com.lsof-org.lsof to /path/to/local/PREMIRROR/
>>>> $ bitbake lsof -cfetch
>>>>
>>>> * There might be "Fix File name too long error" when the local path is long
>>>>     since the path is translated to filename and added to downloads/git2/ as:
>>>>     path.to.local.PREMIRROR.github.com.lsof-org.lsof
>>>>
>>>> * When enable BB_GENERATE_MIRROR_TARBALLS:
>>>>     The tarball name will be:
>>>>     git2_path.to.local.PREMIRROR.github.com.lsof-org.lsof.tar.gz
>>>>
>>>>     The above tarball can't be re-used by do_fetch since bitbake looks for
>>>>     'git2_github.com.lsof-org.lsof.tar.gz', but the 'path.to.local.PREMIRROR' in
>>>>     the file name breaks the match.
>>>>
>>>> Use os.path.basename(ud.path) as gitsrcname when ud.proto is file can fix the
>>>> the above two issues, now the file name in downloads/git2/ is:
>>>> github.com.lsof-org.lsof
>>>>
>>>> And the generated tarball is:
>>>> git2_github.com.lsof-org.lsof.tar.gz
>>>>
>>>> Or when BB_GIT_SHALLOW is enabled:
>>>> gitshallow_github.com.lsof-org.lsof_2e4c7a1-1_master.tar.gz
>>>>
>>>> And the tarballs can be re-used by do_fetch.
>>>>
>>>> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
>>>> ---
>>>>    bitbake/lib/bb/fetch2/git.py | 5 ++++-
>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
>>>> index c7ff769fdf..89d3268f7f 100644
>>>> --- a/bitbake/lib/bb/fetch2/git.py
>>>> +++ b/bitbake/lib/bb/fetch2/git.py
>>>> @@ -277,7 +277,10 @@ class Git(FetchMethod):
>>>>                        ud.unresolvedrev[name] = ud.revisions[name]
>>>>                    ud.revisions[name] = self.latest_revision(ud, d, name)
>>>>    
>>>> -        gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
>>>> +        if ud.proto == "file":
>>>> +            gitsrcname = '%s' % os.path.basename(ud.path)
>>>> +        else:
>>>> +            gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
>>>>            if gitsrcname.startswith('.'):
>>>>                gitsrcname = gitsrcname[1:]
>>>
>>>
>>> I did look at this patch again. I'm still not convinced about the
>>> original use case but ignoring that, the patch itself isn't too safe.
>>
>> Thanks for looking into this, the main problem is that we need a way to fix the
>> filename too long error as described in the commit message above.
>>
>>> If you have two repositories with the same name, this can break since
>>> basename() on the URI is not a unique value.
>>
>> If the two repos are identical, then it isn't a problem since bitbake
>> can handle them correctly, if the two local repos with the same
>> name but have different contents, I think that the user should fix their repos.
> 
> The trouble is that this doesn't just affect local repos, the contents
> of DL_DIR can and will be shared as a public mirror so it is any two
> repos with the same name, not just two local repos.

This only affects protocol=file, without this patch, the

git://path/to/local/path/gitrepo.git:protocol=file

after do_fetch, it will be:

downloads/git2/path.to.local.path.gitrepo.git

The above name can't be used on public mirror because it can't be matched,
unless different people and projects use the same local path, but this isn't the
reality, for example, my local path is:

git://home/robert/to/gitrepo.git;protocol=file

When I upload it to a public mirror, it will be:
home.robert.to.gitrepo.git

For you, it might be:
git://home/rp/to/gitrepo.git;protocol=file

What your do_fetch looking for is:
home.rp.to.gitrepo.git

So it can't be fetched from public mirror, unless you use:
git://home/robert/to/gitrepo.git;protocol=file

But this isn't the reality

> 
> This is partly why I still think mirrors of local file repos don't make
> sense.

It's not only for mirror, but will cause "Filename too long error" for normal
build's do_fetch when:

len(/home/robert/to/long/dir/gitrepo.git) > 256

You can simulate the error with the following command on Ubuntu
20.04 or 22.04:
$ touch $(for i in $(seq 1 30); do echo -n longpart$i; done)

File name too long

This is because the current code changes the path to filename, and
len(filename) > 256.

// Robert


> 
> Cheers,
> 
> Richard
>
Richard Purdie Aug. 7, 2024, 2:58 p.m. UTC | #11
On Wed, 2024-08-07 at 10:23 +0800, Robert Yang via
lists.openembedded.org wrote:
> Hi RP,
> 
> On 8/6/24 22:10, Richard Purdie wrote:
> > On Tue, 2024-08-06 at 21:42 +0800, Robert Yang wrote:
> > 
> > The trouble is that this doesn't just affect local repos, the
> > contents
> > of DL_DIR can and will be shared as a public mirror so it is any
> > two
> > repos with the same name, not just two local repos.
> 
> This only affects protocol=file, without this patch, the
> 
> git://path/to/local/path/gitrepo.git:protocol=file
> 
> after do_fetch, it will be:
> 
> downloads/git2/path.to.local.path.gitrepo.git

Currently we don't support mirroring of file protocol git urls as they
don't really make sense.

> The above name can't be used on public mirror because it can't be
> matched, unless different people and projects use the same local
> path, but this isn't the reality, for example, my local path is:
> 
> git://home/robert/to/gitrepo.git;protocol=file

Correct, but the path is uniquely identified.

> When I upload it to a public mirror, it will be:
> home.robert.to.gitrepo.git
> 
> For you, it might be:
> git://home/rp/to/gitrepo.git;protocol=file
> 
> What your do_fetch looking for is:
> home.rp.to.gitrepo.git
> 
> So it can't be fetched from public mirror, unless you use:
> git://home/robert/to/gitrepo.git;protocol=file
> 
> But this isn't the reality


Agreed, they simply don't work today and are not supported.

> > This is partly why I still think mirrors of local file repos don't
> > make
> > sense.
> 
> It's not only for mirror, but will cause "Filename too long error"
> for normal build's do_fetch when:
> 
> len(/home/robert/to/long/dir/gitrepo.git) > 256
> 
> You can simulate the error with the following command on Ubuntu
> 20.04 or 22.04:
> $ touch $(for i in $(seq 1 30); do echo -n longpart$i; done)
> 
> File name too long
> 
> This is because the current code changes the path to filename, and
> len(filename) > 256.

Sure, but this isn't a good reason to adopt an unsafe naming scheme for
something we don't currently support and don't really want to add
support for.

Cheers,

Richard
diff mbox series

Patch

diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
index c7ff769fdf..89d3268f7f 100644
--- a/bitbake/lib/bb/fetch2/git.py
+++ b/bitbake/lib/bb/fetch2/git.py
@@ -277,7 +277,10 @@  class Git(FetchMethod):
                     ud.unresolvedrev[name] = ud.revisions[name]
                 ud.revisions[name] = self.latest_revision(ud, d, name)
 
-        gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
+        if ud.proto == "file":
+            gitsrcname = '%s' % os.path.basename(ud.path)
+        else:
+            gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
         if gitsrcname.startswith('.'):
             gitsrcname = gitsrcname[1:]