Message ID | 20230913084344.29928-1-mikko.rapeli@linaro.org |
---|---|
State | New |
Headers | show |
Series | bitbake: fetch2: git: revert path checks | expand |
On Wed, 2023-09-13 at 11:43 +0300, Mikko Rapeli wrote: > Commit 10f38157a069cb6938323cadd5e523037a29ace9 > triggers re-clones of uptodate repositories when their > access path changes. This is wrong. It is enough that > needed commits are found from the git repo. For example > when same DL_DIR is accessed using two different paths > due to a symlink, then the resulting absolute paths will > differ but the content is the same. Trust the content. > > Avoids extra downloads of large repositories like linux-yocto > for kernel when the local cache was already uptodate. > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> > --- > bitbake/lib/bb/fetch2/git.py | 14 +------------- > 1 file changed, 1 insertion(+), 13 deletions(-) > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py > index e11271b757..0c547ea303 100644 > --- a/bitbake/lib/bb/fetch2/git.py > +++ b/bitbake/lib/bb/fetch2/git.py > @@ -371,19 +371,7 @@ class Git(FetchMethod): > # The directory may exist, but not be the top level of a bare git > # repository in which case it needs to be deleted and re-cloned. > try: > - # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel > - output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > - > - toplevel = os.path.abspath(output.rstrip()) > - abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') > - # The top level Git directory must either be the clone directory > - # or a child of the clone directory. Any ancestor directory of > - # the clone directory is not valid as the Git directory (and > - # probably belongs to some other unrelated repository), so a > - # clone is required > - if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: > - logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) > - needs_clone = True > + runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > except bb.fetch2.FetchError as e: > logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) > needs_clone = True Doesn't this give rise to some of the original issues (it could find a parent git repo instead of the one being cloned)? Cheers, Richard
Hi, On Wed, Sep 13, 2023 at 09:51:39AM +0100, Richard Purdie wrote: > On Wed, 2023-09-13 at 11:43 +0300, Mikko Rapeli wrote: > > Commit 10f38157a069cb6938323cadd5e523037a29ace9 > > triggers re-clones of uptodate repositories when their > > access path changes. This is wrong. It is enough that > > needed commits are found from the git repo. For example > > when same DL_DIR is accessed using two different paths > > due to a symlink, then the resulting absolute paths will > > differ but the content is the same. Trust the content. > > > > Avoids extra downloads of large repositories like linux-yocto > > for kernel when the local cache was already uptodate. > > > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> > > --- > > bitbake/lib/bb/fetch2/git.py | 14 +------------- > > 1 file changed, 1 insertion(+), 13 deletions(-) > > > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py > > index e11271b757..0c547ea303 100644 > > --- a/bitbake/lib/bb/fetch2/git.py > > +++ b/bitbake/lib/bb/fetch2/git.py > > @@ -371,19 +371,7 @@ class Git(FetchMethod): > > # The directory may exist, but not be the top level of a bare git > > # repository in which case it needs to be deleted and re-cloned. > > try: > > - # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel > > - output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > > - > > - toplevel = os.path.abspath(output.rstrip()) > > - abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') > > - # The top level Git directory must either be the clone directory > > - # or a child of the clone directory. Any ancestor directory of > > - # the clone directory is not valid as the Git directory (and > > - # probably belongs to some other unrelated repository), so a > > - # clone is required > > - if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: > > - logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) > > - needs_clone = True > > + runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > > except bb.fetch2.FetchError as e: > > logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) > > needs_clone = True > > Doesn't this give rise to some of the original issues (it could find a > parent git repo instead of the one being cloned)? Yes, but this has not been a problem for me. Re-cloning existing fully cached repositories is the effect I see. I think the git commit id and branch should decide if fetch or re-clone steps are needed. The path of the repository should not matter. The git remotes also not. Both of those can and do change. Cheers, -Mikko
On Wed, 2023-09-13 at 12:04 +0300, Mikko Rapeli wrote: > Hi, > > On Wed, Sep 13, 2023 at 09:51:39AM +0100, Richard Purdie wrote: > > On Wed, 2023-09-13 at 11:43 +0300, Mikko Rapeli wrote: > > > Commit 10f38157a069cb6938323cadd5e523037a29ace9 > > > triggers re-clones of uptodate repositories when their > > > access path changes. This is wrong. It is enough that > > > needed commits are found from the git repo. For example > > > when same DL_DIR is accessed using two different paths > > > due to a symlink, then the resulting absolute paths will > > > differ but the content is the same. Trust the content. > > > > > > Avoids extra downloads of large repositories like linux-yocto > > > for kernel when the local cache was already uptodate. > > > > > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> > > > --- > > > bitbake/lib/bb/fetch2/git.py | 14 +------------- > > > 1 file changed, 1 insertion(+), 13 deletions(-) > > > > > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py > > > index e11271b757..0c547ea303 100644 > > > --- a/bitbake/lib/bb/fetch2/git.py > > > +++ b/bitbake/lib/bb/fetch2/git.py > > > @@ -371,19 +371,7 @@ class Git(FetchMethod): > > > # The directory may exist, but not be the top level of a bare git > > > # repository in which case it needs to be deleted and re-cloned. > > > try: > > > - # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel > > > - output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > > > - > > > - toplevel = os.path.abspath(output.rstrip()) > > > - abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') > > > - # The top level Git directory must either be the clone directory > > > - # or a child of the clone directory. Any ancestor directory of > > > - # the clone directory is not valid as the Git directory (and > > > - # probably belongs to some other unrelated repository), so a > > > - # clone is required > > > - if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: > > > - logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) > > > - needs_clone = True > > > + runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > > > except bb.fetch2.FetchError as e: > > > logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) > > > needs_clone = True > > > > Doesn't this give rise to some of the original issues (it could find a > > parent git repo instead of the one being cloned)? > > Yes, but this has not been a problem for me. Re-cloning existing fully cached > repositories is the effect I see. > > I think the git commit id and branch should decide if fetch or re-clone steps are needed. > The path of the repository should not matter. The git remotes also not. Both of those can > and do change. In that case you're effectively sending a revert and we should probably be clear about that... We really need to find a way to resolve things for everyone. Cheers, Richard
FWIW: I'm also seeing warnings like: WARNING: Top level directory '/data001/work/downloads/git2/git.libssh.org.projects.libssh.git' doesn't match expected '/mnt/mirror-write/downloads/git2/git.libssh.org.projects.libssh.git'. Re-cloning Where /mnt/mirror-write/downloads/ is configured DL_DIR and /data001/work/downloads/ was probably the DL_DIR of the jenkins jobs which originally fetched the libssh and created the tarball with it for premirror. Cheers, On Wed, Sep 13, 2023 at 11:22 AM Richard Purdie < richard.purdie@linuxfoundation.org> wrote: > On Wed, 2023-09-13 at 12:04 +0300, Mikko Rapeli wrote: > > Hi, > > > > On Wed, Sep 13, 2023 at 09:51:39AM +0100, Richard Purdie wrote: > > > On Wed, 2023-09-13 at 11:43 +0300, Mikko Rapeli wrote: > > > > Commit 10f38157a069cb6938323cadd5e523037a29ace9 > > > > triggers re-clones of uptodate repositories when their > > > > access path changes. This is wrong. It is enough that > > > > needed commits are found from the git repo. For example > > > > when same DL_DIR is accessed using two different paths > > > > due to a symlink, then the resulting absolute paths will > > > > differ but the content is the same. Trust the content. > > > > > > > > Avoids extra downloads of large repositories like linux-yocto > > > > for kernel when the local cache was already uptodate. > > > > > > > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> > > > > --- > > > > bitbake/lib/bb/fetch2/git.py | 14 +------------- > > > > 1 file changed, 1 insertion(+), 13 deletions(-) > > > > > > > > diff --git a/bitbake/lib/bb/fetch2/git.py > b/bitbake/lib/bb/fetch2/git.py > > > > index e11271b757..0c547ea303 100644 > > > > --- a/bitbake/lib/bb/fetch2/git.py > > > > +++ b/bitbake/lib/bb/fetch2/git.py > > > > @@ -371,19 +371,7 @@ class Git(FetchMethod): > > > > # The directory may exist, but not be the top level of > a bare git > > > > # repository in which case it needs to be deleted and > re-cloned. > > > > try: > > > > - # Since clones can be bare, use --absolute-git-dir > instead of --show-toplevel > > > > - output = runfetchcmd("LANG=C %s rev-parse > --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > > > > - > > > > - toplevel = os.path.abspath(output.rstrip()) > > > > - abs_clonedir = > os.path.abspath(ud.clonedir).rstrip('/') > > > > - # The top level Git directory must either be the > clone directory > > > > - # or a child of the clone directory. Any ancestor > directory of > > > > - # the clone directory is not valid as the Git > directory (and > > > > - # probably belongs to some other unrelated > repository), so a > > > > - # clone is required > > > > - if os.path.commonprefix([abs_clonedir, toplevel]) > != abs_clonedir: > > > > - logger.warning("Top level directory '%s' > doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) > > > > - needs_clone = True > > > > + runfetchcmd("LANG=C %s rev-parse > --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > > > > except bb.fetch2.FetchError as e: > > > > logger.warning("Unable to get top level for %s (not > a git directory?): %s", ud.clonedir, e) > > > > needs_clone = True > > > > > > Doesn't this give rise to some of the original issues (it could find a > > > parent git repo instead of the one being cloned)? > > > > Yes, but this has not been a problem for me. Re-cloning existing fully > cached > > repositories is the effect I see. > > > > I think the git commit id and branch should decide if fetch or re-clone > steps are needed. > > The path of the repository should not matter. The git remotes also not. > Both of those can > > and do change. > > In that case you're effectively sending a revert and we should probably > be clear about that... > > We really need to find a way to resolve things for everyone. > > Cheers, > > Richard > > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#15045): > https://lists.openembedded.org/g/bitbake-devel/message/15045 > Mute This Topic: https://lists.openembedded.org/mt/101333074/3617156 > Group Owner: bitbake-devel+owner@lists.openembedded.org > Unsubscribe: > https://lists.openembedded.org/g/bitbake-devel/leave/8021546/3617156/1661570722/xyzzy > [martin.jansa@gmail.com] > -=-=-=-=-=-=-=-=-=-=-=- > >
On Wed, Sep 13, 2023 at 3:42 AM Martin Jansa <martin.jansa@gmail.com> wrote: > > FWIW: I'm also seeing warnings like: > > WARNING: Top level directory '/data001/work/downloads/git2/git.libssh.org.projects.libssh.git' doesn't match expected '/mnt/mirror-write/downloads/git2/git.libssh.org.projects.libssh.git'. Re-cloning > > Where /mnt/mirror-write/downloads/ is configured DL_DIR and /data001/work/downloads/ was probably the DL_DIR of the jenkins jobs which originally fetched the libssh and created the tarball with it for premirror. That's strange.... I'll take a look real quick and see whats going on > > Cheers, > > On Wed, Sep 13, 2023 at 11:22 AM Richard Purdie <richard.purdie@linuxfoundation.org> wrote: >> >> On Wed, 2023-09-13 at 12:04 +0300, Mikko Rapeli wrote: >> > Hi, >> > >> > On Wed, Sep 13, 2023 at 09:51:39AM +0100, Richard Purdie wrote: >> > > On Wed, 2023-09-13 at 11:43 +0300, Mikko Rapeli wrote: >> > > > Commit 10f38157a069cb6938323cadd5e523037a29ace9 >> > > > triggers re-clones of uptodate repositories when their >> > > > access path changes. This is wrong. It is enough that >> > > > needed commits are found from the git repo. For example >> > > > when same DL_DIR is accessed using two different paths >> > > > due to a symlink, then the resulting absolute paths will >> > > > differ but the content is the same. Trust the content. >> > > > >> > > > Avoids extra downloads of large repositories like linux-yocto >> > > > for kernel when the local cache was already uptodate. >> > > > >> > > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> >> > > > --- >> > > > bitbake/lib/bb/fetch2/git.py | 14 +------------- >> > > > 1 file changed, 1 insertion(+), 13 deletions(-) >> > > > >> > > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py >> > > > index e11271b757..0c547ea303 100644 >> > > > --- a/bitbake/lib/bb/fetch2/git.py >> > > > +++ b/bitbake/lib/bb/fetch2/git.py >> > > > @@ -371,19 +371,7 @@ class Git(FetchMethod): >> > > > # The directory may exist, but not be the top level of a bare git >> > > > # repository in which case it needs to be deleted and re-cloned. >> > > > try: >> > > > - # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel >> > > > - output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) >> > > > - >> > > > - toplevel = os.path.abspath(output.rstrip()) >> > > > - abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') >> > > > - # The top level Git directory must either be the clone directory >> > > > - # or a child of the clone directory. Any ancestor directory of >> > > > - # the clone directory is not valid as the Git directory (and >> > > > - # probably belongs to some other unrelated repository), so a >> > > > - # clone is required >> > > > - if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: >> > > > - logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) >> > > > - needs_clone = True >> > > > + runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) >> > > > except bb.fetch2.FetchError as e: >> > > > logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) >> > > > needs_clone = True >> > > >> > > Doesn't this give rise to some of the original issues (it could find a >> > > parent git repo instead of the one being cloned)? >> > >> > Yes, but this has not been a problem for me. Re-cloning existing fully cached >> > repositories is the effect I see. >> > >> > I think the git commit id and branch should decide if fetch or re-clone steps are needed. >> > The path of the repository should not matter. The git remotes also not. Both of those can >> > and do change. >> >> In that case you're effectively sending a revert and we should probably >> be clear about that... >> >> We really need to find a way to resolve things for everyone. >> >> Cheers, >> >> Richard >> >> -=-=-=-=-=-=-=-=-=-=-=- >> Links: You receive all messages sent to this group. >> View/Reply Online (#15045): https://lists.openembedded.org/g/bitbake-devel/message/15045 >> Mute This Topic: https://lists.openembedded.org/mt/101333074/3617156 >> Group Owner: bitbake-devel+owner@lists.openembedded.org >> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/leave/8021546/3617156/1661570722/xyzzy [martin.jansa@gmail.com] >> -=-=-=-=-=-=-=-=-=-=-=- >>
On Wed, Sep 13, 2023 at 3:42 AM Martin Jansa <martin.jansa@gmail.com> wrote: > > FWIW: I'm also seeing warnings like: > > WARNING: Top level directory '/data001/work/downloads/git2/git.libssh.org.projects.libssh.git' doesn't match expected '/mnt/mirror-write/downloads/git2/git.libssh.org.projects.libssh.git'. Re-cloning > > Where /mnt/mirror-write/downloads/ is configured DL_DIR and /data001/work/downloads/ was probably the DL_DIR of the jenkins jobs which originally fetched the libssh and created the tarball with it for premirror. Ya, that /data001 path is _really_ weird. That path comes from running `git rev-parse --absolute-git-dir` in the extracted directory (ud.clonedir, '/mnt/mirror-write/downloads/git2/git.libssh.org.projects.libssh.git'). The should be no absolute paths encoded in the mirror tarballs, so this seems like it's some sort of misconfiguration(?), and not related to Mikkos problem. I think I have a fix for Mikkos symlink problem though > > Cheers, > > On Wed, Sep 13, 2023 at 11:22 AM Richard Purdie <richard.purdie@linuxfoundation.org> wrote: >> >> On Wed, 2023-09-13 at 12:04 +0300, Mikko Rapeli wrote: >> > Hi, >> > >> > On Wed, Sep 13, 2023 at 09:51:39AM +0100, Richard Purdie wrote: >> > > On Wed, 2023-09-13 at 11:43 +0300, Mikko Rapeli wrote: >> > > > Commit 10f38157a069cb6938323cadd5e523037a29ace9 >> > > > triggers re-clones of uptodate repositories when their >> > > > access path changes. This is wrong. It is enough that >> > > > needed commits are found from the git repo. For example >> > > > when same DL_DIR is accessed using two different paths >> > > > due to a symlink, then the resulting absolute paths will >> > > > differ but the content is the same. Trust the content. >> > > > >> > > > Avoids extra downloads of large repositories like linux-yocto >> > > > for kernel when the local cache was already uptodate. >> > > > >> > > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> >> > > > --- >> > > > bitbake/lib/bb/fetch2/git.py | 14 +------------- >> > > > 1 file changed, 1 insertion(+), 13 deletions(-) >> > > > >> > > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py >> > > > index e11271b757..0c547ea303 100644 >> > > > --- a/bitbake/lib/bb/fetch2/git.py >> > > > +++ b/bitbake/lib/bb/fetch2/git.py >> > > > @@ -371,19 +371,7 @@ class Git(FetchMethod): >> > > > # The directory may exist, but not be the top level of a bare git >> > > > # repository in which case it needs to be deleted and re-cloned. >> > > > try: >> > > > - # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel >> > > > - output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) >> > > > - >> > > > - toplevel = os.path.abspath(output.rstrip()) >> > > > - abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') >> > > > - # The top level Git directory must either be the clone directory >> > > > - # or a child of the clone directory. Any ancestor directory of >> > > > - # the clone directory is not valid as the Git directory (and >> > > > - # probably belongs to some other unrelated repository), so a >> > > > - # clone is required >> > > > - if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: >> > > > - logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) >> > > > - needs_clone = True >> > > > + runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) >> > > > except bb.fetch2.FetchError as e: >> > > > logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) >> > > > needs_clone = True >> > > >> > > Doesn't this give rise to some of the original issues (it could find a >> > > parent git repo instead of the one being cloned)? >> > >> > Yes, but this has not been a problem for me. Re-cloning existing fully cached >> > repositories is the effect I see. >> > >> > I think the git commit id and branch should decide if fetch or re-clone steps are needed. >> > The path of the repository should not matter. The git remotes also not. Both of those can >> > and do change. >> >> In that case you're effectively sending a revert and we should probably >> be clear about that... >> >> We really need to find a way to resolve things for everyone. >> >> Cheers, >> >> Richard >> >> -=-=-=-=-=-=-=-=-=-=-=- >> Links: You receive all messages sent to this group. >> View/Reply Online (#15045): https://lists.openembedded.org/g/bitbake-devel/message/15045 >> Mute This Topic: https://lists.openembedded.org/mt/101333074/3617156 >> Group Owner: bitbake-devel+owner@lists.openembedded.org >> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/leave/8021546/3617156/1661570722/xyzzy [martin.jansa@gmail.com] >> -=-=-=-=-=-=-=-=-=-=-=- >>
> not related to Mikkos problem. > I think I have a fix for Mikkos symlink problem though Maybe it is related in the end, to my surprise /mnt/mirror-write/downloads is on some servers a symlink to /data001/work/downloads and then this /mnt/mirror-write/download is used as DL_DIR there (not /data001/work/downloads directly). On Wed, Sep 13, 2023 at 4:45 PM Joshua Watt <jpewhacker@gmail.com> wrote: > On Wed, Sep 13, 2023 at 3:42 AM Martin Jansa <martin.jansa@gmail.com> > wrote: > > > > FWIW: I'm also seeing warnings like: > > > > WARNING: Top level directory > '/data001/work/downloads/git2/git.libssh.org.projects.libssh.git' doesn't > match expected > '/mnt/mirror-write/downloads/git2/git.libssh.org.projects.libssh.git'. > Re-cloning > > > > Where /mnt/mirror-write/downloads/ is configured DL_DIR and > /data001/work/downloads/ was probably the DL_DIR of the jenkins jobs which > originally fetched the libssh and created the tarball with it for premirror. > > Ya, that /data001 path is _really_ weird. That path comes from running > `git rev-parse --absolute-git-dir` in the extracted directory > (ud.clonedir, > '/mnt/mirror-write/downloads/git2/git.libssh.org.projects.libssh.git'). > The should be no absolute paths encoded in the mirror tarballs, so > this seems like it's some sort of misconfiguration(?), and not related > to Mikkos problem. > > > I think I have a fix for Mikkos symlink problem though > > > > > Cheers, > > > > On Wed, Sep 13, 2023 at 11:22 AM Richard Purdie < > richard.purdie@linuxfoundation.org> wrote: > >> > >> On Wed, 2023-09-13 at 12:04 +0300, Mikko Rapeli wrote: > >> > Hi, > >> > > >> > On Wed, Sep 13, 2023 at 09:51:39AM +0100, Richard Purdie wrote: > >> > > On Wed, 2023-09-13 at 11:43 +0300, Mikko Rapeli wrote: > >> > > > Commit 10f38157a069cb6938323cadd5e523037a29ace9 > >> > > > triggers re-clones of uptodate repositories when their > >> > > > access path changes. This is wrong. It is enough that > >> > > > needed commits are found from the git repo. For example > >> > > > when same DL_DIR is accessed using two different paths > >> > > > due to a symlink, then the resulting absolute paths will > >> > > > differ but the content is the same. Trust the content. > >> > > > > >> > > > Avoids extra downloads of large repositories like linux-yocto > >> > > > for kernel when the local cache was already uptodate. > >> > > > > >> > > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> > >> > > > --- > >> > > > bitbake/lib/bb/fetch2/git.py | 14 +------------- > >> > > > 1 file changed, 1 insertion(+), 13 deletions(-) > >> > > > > >> > > > diff --git a/bitbake/lib/bb/fetch2/git.py > b/bitbake/lib/bb/fetch2/git.py > >> > > > index e11271b757..0c547ea303 100644 > >> > > > --- a/bitbake/lib/bb/fetch2/git.py > >> > > > +++ b/bitbake/lib/bb/fetch2/git.py > >> > > > @@ -371,19 +371,7 @@ class Git(FetchMethod): > >> > > > # The directory may exist, but not be the top level > of a bare git > >> > > > # repository in which case it needs to be deleted > and re-cloned. > >> > > > try: > >> > > > - # Since clones can be bare, use > --absolute-git-dir instead of --show-toplevel > >> > > > - output = runfetchcmd("LANG=C %s rev-parse > --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > >> > > > - > >> > > > - toplevel = os.path.abspath(output.rstrip()) > >> > > > - abs_clonedir = > os.path.abspath(ud.clonedir).rstrip('/') > >> > > > - # The top level Git directory must either be the > clone directory > >> > > > - # or a child of the clone directory. Any > ancestor directory of > >> > > > - # the clone directory is not valid as the Git > directory (and > >> > > > - # probably belongs to some other unrelated > repository), so a > >> > > > - # clone is required > >> > > > - if os.path.commonprefix([abs_clonedir, > toplevel]) != abs_clonedir: > >> > > > - logger.warning("Top level directory '%s' > doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) > >> > > > - needs_clone = True > >> > > > + runfetchcmd("LANG=C %s rev-parse > --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > >> > > > except bb.fetch2.FetchError as e: > >> > > > logger.warning("Unable to get top level for %s > (not a git directory?): %s", ud.clonedir, e) > >> > > > needs_clone = True > >> > > > >> > > Doesn't this give rise to some of the original issues (it could > find a > >> > > parent git repo instead of the one being cloned)? > >> > > >> > Yes, but this has not been a problem for me. Re-cloning existing > fully cached > >> > repositories is the effect I see. > >> > > >> > I think the git commit id and branch should decide if fetch or > re-clone steps are needed. > >> > The path of the repository should not matter. The git remotes also > not. Both of those can > >> > and do change. > >> > >> In that case you're effectively sending a revert and we should probably > >> be clear about that... > >> > >> We really need to find a way to resolve things for everyone. > >> > >> Cheers, > >> > >> Richard > >> > >> -=-=-=-=-=-=-=-=-=-=-=- > >> Links: You receive all messages sent to this group. > >> View/Reply Online (#15045): > https://lists.openembedded.org/g/bitbake-devel/message/15045 > >> Mute This Topic: https://lists.openembedded.org/mt/101333074/3617156 > >> Group Owner: bitbake-devel+owner@lists.openembedded.org > >> Unsubscribe: > https://lists.openembedded.org/g/bitbake-devel/leave/8021546/3617156/1661570722/xyzzy > [martin.jansa@gmail.com] > >> -=-=-=-=-=-=-=-=-=-=-=- > >> >
On Wed, Sep 13, 2023 at 8:53 AM Martin Jansa <martin.jansa@gmail.com> wrote: > > > not related to Mikkos problem. > > I think I have a fix for Mikkos symlink problem though > > Maybe it is related in the end, to my surprise /mnt/mirror-write/downloads is on some servers a symlink to /data001/work/downloads and then this /mnt/mirror-write/download is used as DL_DIR there (not /data001/work/downloads directly). Ah, OK. That is the same problem as Mikko. I was assuming you were seeing that locally where the two aren't symlinked together. That is a much simpler explanation :) > > On Wed, Sep 13, 2023 at 4:45 PM Joshua Watt <jpewhacker@gmail.com> wrote: >> >> On Wed, Sep 13, 2023 at 3:42 AM Martin Jansa <martin.jansa@gmail.com> wrote: >> > >> > FWIW: I'm also seeing warnings like: >> > >> > WARNING: Top level directory '/data001/work/downloads/git2/git.libssh.org.projects.libssh.git' doesn't match expected '/mnt/mirror-write/downloads/git2/git.libssh.org.projects.libssh.git'. Re-cloning >> > >> > Where /mnt/mirror-write/downloads/ is configured DL_DIR and /data001/work/downloads/ was probably the DL_DIR of the jenkins jobs which originally fetched the libssh and created the tarball with it for premirror. >> >> Ya, that /data001 path is _really_ weird. That path comes from running >> `git rev-parse --absolute-git-dir` in the extracted directory >> (ud.clonedir, '/mnt/mirror-write/downloads/git2/git.libssh.org.projects.libssh.git'). >> The should be no absolute paths encoded in the mirror tarballs, so >> this seems like it's some sort of misconfiguration(?), and not related >> to Mikkos problem. >> >> >> I think I have a fix for Mikkos symlink problem though >> >> > >> > Cheers, >> > >> > On Wed, Sep 13, 2023 at 11:22 AM Richard Purdie <richard.purdie@linuxfoundation.org> wrote: >> >> >> >> On Wed, 2023-09-13 at 12:04 +0300, Mikko Rapeli wrote: >> >> > Hi, >> >> > >> >> > On Wed, Sep 13, 2023 at 09:51:39AM +0100, Richard Purdie wrote: >> >> > > On Wed, 2023-09-13 at 11:43 +0300, Mikko Rapeli wrote: >> >> > > > Commit 10f38157a069cb6938323cadd5e523037a29ace9 >> >> > > > triggers re-clones of uptodate repositories when their >> >> > > > access path changes. This is wrong. It is enough that >> >> > > > needed commits are found from the git repo. For example >> >> > > > when same DL_DIR is accessed using two different paths >> >> > > > due to a symlink, then the resulting absolute paths will >> >> > > > differ but the content is the same. Trust the content. >> >> > > > >> >> > > > Avoids extra downloads of large repositories like linux-yocto >> >> > > > for kernel when the local cache was already uptodate. >> >> > > > >> >> > > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> >> >> > > > --- >> >> > > > bitbake/lib/bb/fetch2/git.py | 14 +------------- >> >> > > > 1 file changed, 1 insertion(+), 13 deletions(-) >> >> > > > >> >> > > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py >> >> > > > index e11271b757..0c547ea303 100644 >> >> > > > --- a/bitbake/lib/bb/fetch2/git.py >> >> > > > +++ b/bitbake/lib/bb/fetch2/git.py >> >> > > > @@ -371,19 +371,7 @@ class Git(FetchMethod): >> >> > > > # The directory may exist, but not be the top level of a bare git >> >> > > > # repository in which case it needs to be deleted and re-cloned. >> >> > > > try: >> >> > > > - # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel >> >> > > > - output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) >> >> > > > - >> >> > > > - toplevel = os.path.abspath(output.rstrip()) >> >> > > > - abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') >> >> > > > - # The top level Git directory must either be the clone directory >> >> > > > - # or a child of the clone directory. Any ancestor directory of >> >> > > > - # the clone directory is not valid as the Git directory (and >> >> > > > - # probably belongs to some other unrelated repository), so a >> >> > > > - # clone is required >> >> > > > - if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: >> >> > > > - logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) >> >> > > > - needs_clone = True >> >> > > > + runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) >> >> > > > except bb.fetch2.FetchError as e: >> >> > > > logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) >> >> > > > needs_clone = True >> >> > > >> >> > > Doesn't this give rise to some of the original issues (it could find a >> >> > > parent git repo instead of the one being cloned)? >> >> > >> >> > Yes, but this has not been a problem for me. Re-cloning existing fully cached >> >> > repositories is the effect I see. >> >> > >> >> > I think the git commit id and branch should decide if fetch or re-clone steps are needed. >> >> > The path of the repository should not matter. The git remotes also not. Both of those can >> >> > and do change. >> >> >> >> In that case you're effectively sending a revert and we should probably >> >> be clear about that... >> >> >> >> We really need to find a way to resolve things for everyone. >> >> >> >> Cheers, >> >> >> >> Richard >> >> >> >> -=-=-=-=-=-=-=-=-=-=-=- >> >> Links: You receive all messages sent to this group. >> >> View/Reply Online (#15045): https://lists.openembedded.org/g/bitbake-devel/message/15045 >> >> Mute This Topic: https://lists.openembedded.org/mt/101333074/3617156 >> >> Group Owner: bitbake-devel+owner@lists.openembedded.org >> >> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/leave/8021546/3617156/1661570722/xyzzy [martin.jansa@gmail.com] >> >> -=-=-=-=-=-=-=-=-=-=-=- >> >>
diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py index e11271b757..0c547ea303 100644 --- a/bitbake/lib/bb/fetch2/git.py +++ b/bitbake/lib/bb/fetch2/git.py @@ -371,19 +371,7 @@ class Git(FetchMethod): # The directory may exist, but not be the top level of a bare git # repository in which case it needs to be deleted and re-cloned. try: - # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel - output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) - - toplevel = os.path.abspath(output.rstrip()) - abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') - # The top level Git directory must either be the clone directory - # or a child of the clone directory. Any ancestor directory of - # the clone directory is not valid as the Git directory (and - # probably belongs to some other unrelated repository), so a - # clone is required - if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: - logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) - needs_clone = True + runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) except bb.fetch2.FetchError as e: logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) needs_clone = True
Commit 10f38157a069cb6938323cadd5e523037a29ace9 triggers re-clones of uptodate repositories when their access path changes. This is wrong. It is enough that needed commits are found from the git repo. For example when same DL_DIR is accessed using two different paths due to a symlink, then the resulting absolute paths will differ but the content is the same. Trust the content. Avoids extra downloads of large repositories like linux-yocto for kernel when the local cache was already uptodate. Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org> --- bitbake/lib/bb/fetch2/git.py | 14 +------------- 1 file changed, 1 insertion(+), 13 deletions(-)