[v3,1/3] Honour BB_FETCH_PREMIRRORONLY option

Message ID 20220506141733.3298-1-pavel@zhukoff.net
State New
Headers show
Series [v3,1/3] Honour BB_FETCH_PREMIRRORONLY option | expand

Commit Message

Pavel Zhukov May 6, 2022, 2:17 p.m. UTC
This should fix [Yocto 13353] and related to [Yocto 13233] as well.
Previously if git repo mirror has been updated in between of two builds
fetcher of the second build didn't try updated mirror but switched to
git clone from upstream instead. This is problem for offline builds.
Fix this to raise MirrorException if BB_FETCH_PREMIRRORONLY has been
specified by the mirror doesn't contain SRC_REV.

Signed-off-by: Pavel Zhukov <pavel.zhukov@huawei.com>
---
 lib/bb/fetch2/__init__.py | 2 ++
 lib/bb/fetch2/git.py      | 6 ++++--
 2 files changed, 6 insertions(+), 2 deletions(-)

Comments

Richard Purdie May 12, 2022, 9:21 p.m. UTC | #1
On Fri, 2022-05-06 at 16:17 +0200, Pavel Zhukov wrote:
> This should fix [Yocto 13353] and related to [Yocto 13233] as well.
> Previously if git repo mirror has been updated in between of two builds
> fetcher of the second build didn't try updated mirror but switched to
> git clone from upstream instead. This is problem for offline builds.
> Fix this to raise MirrorException if BB_FETCH_PREMIRRORONLY has been
> specified by the mirror doesn't contain SRC_REV.
> 
> Signed-off-by: Pavel Zhukov <pavel.zhukov@huawei.com>
> ---
>  lib/bb/fetch2/__init__.py | 2 ++
>  lib/bb/fetch2/git.py      | 6 ++++--
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
> index ac557176..c99cd104 100644
> --- a/lib/bb/fetch2/__init__.py
> +++ b/lib/bb/fetch2/__init__.py
> @@ -1745,6 +1745,8 @@ class Fetch(object):
>                              done = False
>  
>                  if premirroronly:
> +                    if not done:
> +                        raise FetchError("Failed to download premirror {} and BB_FETCH_PREMIRRORONLY has been set to 1.".format(self.d.getVar("PREMIRRORS")))
>                      self.d.setVar("BB_NO_NETWORK", "1")
>  
>                  firsterr = None
> diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
> index bdcfa497..e556f297 100644
> --- a/lib/bb/fetch2/git.py
> +++ b/lib/bb/fetch2/git.py
> @@ -353,10 +353,12 @@ class Git(FetchMethod):
>          if ud.shallow and os.path.exists(ud.fullshallow) and self.need_update(ud, d):
>              ud.localpath = ud.fullshallow
>              return
> -        elif os.path.exists(ud.fullmirror) and not os.path.exists(ud.clonedir):
> +        if os.path.exists(ud.fullmirror) and self.try_premirror(ud, d) and self.need_update(ud, d):
>              bb.utils.mkdirhier(ud.clonedir)
>              runfetchcmd("tar -xzf %s" % ud.fullmirror, d, workdir=ud.clonedir)
> -
> +            if self.need_update(ud, d) and bb.utils.to_boolean(d.getVar("BB_FETCH_PREMIRRORONLY")):
> +                    raise bb.fetch2.FetchError("Premirror doesn't contain revisions {} and "\
> +                            "upstream cannot be used due to BB_FETCH_PREMIRRORONLY setting".format(ud.revisions))
>          repourl = self._get_repo_url(ud)
> 

Thanks for this series, it is a long standing issue we need to fix.
Sorry for the delayed reply, I've had to think about this for while. I
think your patch is mostly correct. What I'm worried about is the
fetchers having premirror knowledge, that was intended to say inside
__init__.py in the fetcher.

I think there is a better way for the git fetcher to behave. It tries
hard to preserve clonedir if it already exists but in the premirror
case it will remove it.

This does lead me to one concern with the patch as it stands which is
that if clonedir already exists and we untar ud.fullmirror over the top
of it, the mix of existing and new files may be toxic (e.g. expanded
refs and packed-refs being mixed). At the very least we need a delete
in there.

I think what it should do is untar ud.fullmirror to a temp directory,
then replace clonedir if this looks "better" than clonedir, else if
just deletes the temp directory.

I think that way we can remove a lot of the twisted logic and simplify
things. Would you agree and is that something you could look at?

Cheers,

Richard
Richard Purdie May 12, 2022, 10:04 p.m. UTC | #2
On Thu, 2022-05-12 at 22:21 +0100, Richard Purdie via
lists.openembedded.org wrote:
> On Fri, 2022-05-06 at 16:17 +0200, Pavel Zhukov wrote:
> > This should fix [Yocto 13353] and related to [Yocto 13233] as well.
> > Previously if git repo mirror has been updated in between of two builds
> > fetcher of the second build didn't try updated mirror but switched to
> > git clone from upstream instead. This is problem for offline builds.
> > Fix this to raise MirrorException if BB_FETCH_PREMIRRORONLY has been
> > specified by the mirror doesn't contain SRC_REV.
> > 
> > Signed-off-by: Pavel Zhukov <pavel.zhukov@huawei.com>
> > ---
> >  lib/bb/fetch2/__init__.py | 2 ++
> >  lib/bb/fetch2/git.py      | 6 ++++--
> >  2 files changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
> > index ac557176..c99cd104 100644
> > --- a/lib/bb/fetch2/__init__.py
> > +++ b/lib/bb/fetch2/__init__.py
> > @@ -1745,6 +1745,8 @@ class Fetch(object):
> >                              done = False
> >  
> >                  if premirroronly:
> > +                    if not done:
> > +                        raise FetchError("Failed to download premirror {} and BB_FETCH_PREMIRRORONLY has been set to 1.".format(self.d.getVar("PREMIRRORS")))
> >                      self.d.setVar("BB_NO_NETWORK", "1")
> >  
> >                  firsterr = None
> > diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
> > index bdcfa497..e556f297 100644
> > --- a/lib/bb/fetch2/git.py
> > +++ b/lib/bb/fetch2/git.py
> > @@ -353,10 +353,12 @@ class Git(FetchMethod):
> >          if ud.shallow and os.path.exists(ud.fullshallow) and self.need_update(ud, d):
> >              ud.localpath = ud.fullshallow
> >              return
> > -        elif os.path.exists(ud.fullmirror) and not os.path.exists(ud.clonedir):
> > +        if os.path.exists(ud.fullmirror) and self.try_premirror(ud, d) and self.need_update(ud, d):
> >              bb.utils.mkdirhier(ud.clonedir)
> >              runfetchcmd("tar -xzf %s" % ud.fullmirror, d, workdir=ud.clonedir)
> > -
> > +            if self.need_update(ud, d) and bb.utils.to_boolean(d.getVar("BB_FETCH_PREMIRRORONLY")):
> > +                    raise bb.fetch2.FetchError("Premirror doesn't contain revisions {} and "\
> > +                            "upstream cannot be used due to BB_FETCH_PREMIRRORONLY setting".format(ud.revisions))
> >          repourl = self._get_repo_url(ud)
> > 
> 
> Thanks for this series, it is a long standing issue we need to fix.
> Sorry for the delayed reply, I've had to think about this for while. I
> think your patch is mostly correct. What I'm worried about is the
> fetchers having premirror knowledge, that was intended to say inside
> __init__.py in the fetcher.
> 
> I think there is a better way for the git fetcher to behave. It tries
> hard to preserve clonedir if it already exists but in the premirror
> case it will remove it.
> 
> This does lead me to one concern with the patch as it stands which is
> that if clonedir already exists and we untar ud.fullmirror over the top
> of it, the mix of existing and new files may be toxic (e.g. expanded
> refs and packed-refs being mixed). At the very least we need a delete
> in there.
> 
> I think what it should do is untar ud.fullmirror to a temp directory,
> then replace clonedir if this looks "better" than clonedir, else if
> just deletes the temp directory.
> 
> I think that way we can remove a lot of the twisted logic and simplify
> things. Would you agree and is that something you could look at?

I realised I wasn't clear. I've never liked deleting clonedir since
this has the potential to lose data. The pathalogical case is where we
screw up a revision of linux-yocto, it gets deleted as bitbake can't
find the revision and then the user has to download the whole thing
again.

I mentioned deletion above but I've realised we could clone into
clonedir from the temp dir, thereby removing the risk other revisions
were removed from the repo too. That would seem to be the best way to
handle this to mitigate the most issues.

Cheers,

Richard
Pavel Zhukov May 16, 2022, 10:55 a.m. UTC | #3
"Richard Purdie" <richard.purdie@linuxfoundation.org> writes:

> On Thu, 2022-05-12 at 22:21 +0100, Richard Purdie via
> lists.openembedded.org wrote:
>> On Fri, 2022-05-06 at 16:17 +0200, Pavel Zhukov wrote:
>> > This should fix [Yocto 13353] and related to [Yocto 13233] as well.
>> > Previously if git repo mirror has been updated in between of two builds
>> > fetcher of the second build didn't try updated mirror but switched to
>> > git clone from upstream instead. This is problem for offline builds.
>> > Fix this to raise MirrorException if BB_FETCH_PREMIRRORONLY has been
>> > specified by the mirror doesn't contain SRC_REV.
>> > 
>> > Signed-off-by: Pavel Zhukov <pavel.zhukov@huawei.com>
>> > ---
>> >  lib/bb/fetch2/__init__.py | 2 ++
>> >  lib/bb/fetch2/git.py      | 6 ++++--
>> >  2 files changed, 6 insertions(+), 2 deletions(-)
>> > 
>> > diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
>> > index ac557176..c99cd104 100644
>> > --- a/lib/bb/fetch2/__init__.py
>> > +++ b/lib/bb/fetch2/__init__.py
>> > @@ -1745,6 +1745,8 @@ class Fetch(object):
>> >                              done = False
>> >  
>> >                  if premirroronly:
>> > +                    if not done:
>> > +                        raise FetchError("Failed to download premirror {} and BB_FETCH_PREMIRRORONLY has been set to 1.".format(self.d.getVar("PREMIRRORS")))
>> >                      self.d.setVar("BB_NO_NETWORK", "1")
>> >  
>> >                  firsterr = None
>> > diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
>> > index bdcfa497..e556f297 100644
>> > --- a/lib/bb/fetch2/git.py
>> > +++ b/lib/bb/fetch2/git.py
>> > @@ -353,10 +353,12 @@ class Git(FetchMethod):
>> >          if ud.shallow and os.path.exists(ud.fullshallow) and self.need_update(ud, d):
>> >              ud.localpath = ud.fullshallow
>> >              return
>> > -        elif os.path.exists(ud.fullmirror) and not os.path.exists(ud.clonedir):
>> > +        if os.path.exists(ud.fullmirror) and self.try_premirror(ud, d) and self.need_update(ud, d):
>> >              bb.utils.mkdirhier(ud.clonedir)
>> >              runfetchcmd("tar -xzf %s" % ud.fullmirror, d, workdir=ud.clonedir)
>> > -
>> > +            if self.need_update(ud, d) and bb.utils.to_boolean(d.getVar("BB_FETCH_PREMIRRORONLY")):
>> > +                    raise bb.fetch2.FetchError("Premirror doesn't contain revisions {} and "\
>> > +                            "upstream cannot be used due to BB_FETCH_PREMIRRORONLY setting".format(ud.revisions))
>> >          repourl = self._get_repo_url(ud)
>> > 
>> 
>> Thanks for this series, it is a long standing issue we need to fix.
>> Sorry for the delayed reply, I've had to think about this for while. I
>> think your patch is mostly correct. What I'm worried about is the
>> fetchers having premirror knowledge, that was intended to say inside
>> __init__.py in the fetcher.
>> 
>> I think there is a better way for the git fetcher to behave. It tries
>> hard to preserve clonedir if it already exists but in the premirror
>> case it will remove it.
>> 
>> This does lead me to one concern with the patch as it stands which is
>> that if clonedir already exists and we untar ud.fullmirror over the top
>> of it, the mix of existing and new files may be toxic (e.g. expanded
>> refs and packed-refs being mixed). At the very least we need a delete
>> in there.
>> 
>> I think what it should do is untar ud.fullmirror to a temp directory,
>> then replace clonedir if this looks "better" than clonedir, else if
>> just deletes the temp directory.
>> 
>> I think that way we can remove a lot of the twisted logic and simplify
>> things. Would you agree and is that something you could look at?
>
> I realised I wasn't clear. I've never liked deleting clonedir since
> this has the potential to lose data. The pathalogical case is where we
> screw up a revision of linux-yocto, it gets deleted as bitbake can't
> find the revision and then the user has to download the whole thing
> again.
>
> I mentioned deletion above but I've realised we could clone into
> clonedir from the temp dir, thereby removing the risk other revisions
> were removed from the repo too. That would seem to be the best way to
> handle this to mitigate the most issues.
>
> Cheers,
>
> Richard
>
Thank you for the review. While addressing review comments I've found
one more bug related to this patch which can be fixed by fetching from
the fullmirror instead of overwriting it. Looking into it.

-- Pavel
>
>
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#13700): https://lists.openembedded.org/g/bitbake-devel/message/13700
> Mute This Topic: https://lists.openembedded.org/mt/90934195/6390638
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [pavel@zhukoff.net]
> -=-=-=-=-=-=-=-=-=-=-=-

Patch

diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
index ac557176..c99cd104 100644
--- a/lib/bb/fetch2/__init__.py
+++ b/lib/bb/fetch2/__init__.py
@@ -1745,6 +1745,8 @@  class Fetch(object):
                             done = False
 
                 if premirroronly:
+                    if not done:
+                        raise FetchError("Failed to download premirror {} and BB_FETCH_PREMIRRORONLY has been set to 1.".format(self.d.getVar("PREMIRRORS")))
                     self.d.setVar("BB_NO_NETWORK", "1")
 
                 firsterr = None
diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
index bdcfa497..e556f297 100644
--- a/lib/bb/fetch2/git.py
+++ b/lib/bb/fetch2/git.py
@@ -353,10 +353,12 @@  class Git(FetchMethod):
         if ud.shallow and os.path.exists(ud.fullshallow) and self.need_update(ud, d):
             ud.localpath = ud.fullshallow
             return
-        elif os.path.exists(ud.fullmirror) and not os.path.exists(ud.clonedir):
+        if os.path.exists(ud.fullmirror) and self.try_premirror(ud, d) and self.need_update(ud, d):
             bb.utils.mkdirhier(ud.clonedir)
             runfetchcmd("tar -xzf %s" % ud.fullmirror, d, workdir=ud.clonedir)
-
+            if self.need_update(ud, d) and bb.utils.to_boolean(d.getVar("BB_FETCH_PREMIRRORONLY")):
+                    raise bb.fetch2.FetchError("Premirror doesn't contain revisions {} and "\
+                            "upstream cannot be used due to BB_FETCH_PREMIRRORONLY setting".format(ud.revisions))
         repourl = self._get_repo_url(ud)
 
         # If the repo still doesn't exist, fallback to cloning it