diff mbox series

[bitbake-devel] fetch2/git.py: fix the try_premirror logic

Message ID 20240130111158.1077626-1-Qi.Chen@windriver.com
State New
Headers show
Series [bitbake-devel] fetch2/git.py: fix the try_premirror logic | expand

Commit Message

Chen, Qi Jan. 30, 2024, 11:11 a.m. UTC
From: Chen Qi <Qi.Chen@windriver.com>

For gitsm recipes, it's possible that some URL is used more than once.
e.g.,
A -> B:rev1 (B is a submodule of A)
A -> C (C is a submodule of A)
C -> B:rev2 (B is a submodule of C)
A anc C are both using B as submodules, but on different revs.
Now if we have:
B:rev1 -> D
B:rev2 -> E
Then, the mirror will not be fully used.
Say we have all repo mirrors for A, B, C, D, E, then in theory it's not
necessary to reach out to any network for downloading. But it's not the
case. After downloading B(rev1) and its submodule D from mirrors, the fetch
process continues to download C, thus B(rev2) and E. Now it finds that B
needs an update because its submodule E needs an update. Of course this is
true because E is not downloaded yet. Now the problem comes to whether to
use mirror or not. The git.py defines try_premirror to return 'False' when
the ud.clonedir exists. As B has been cloned, the ud.clonedir exists and
try_mirror returns False, resulting in not using mirror and going to upstream
directly.

We can see that the mirrors are not fully used. This is usually not problem,
as the cost is only some network download. But in case the following two
settings are there, we get errors.
BB_NO_NETWORK = "0"
BB_ALLOWED_NETWORKS = "*.some.allowed.domain"
In such case, the gitsm recipe A will fail to fetch. Note that all contents
that A needs are in mirrors and now it's failing to fetch. This is unexpected.

Note that the different revs of the same repo in gitsm recipe is not the only
way to reveal this problem. For example, there might be a recipe call B that
uses B:rev3. Check the protobuf and grpc recipes as an example.

For now, we can use the following steps to reproduce this issue. To be clear,
the grpc recipe in meta-oe is now 1.60.0.
1. Add in local.conf:
   DL_DIR = "${TOPDIR}/downloads-premirror"
   bitbake grpc -c fetch
2. Comment out the DL_DIR setting in local.conf and add the following lines:
   PREMIRRORS:append = " \
     git://.*/.* git://${TOPDIR}/downloads-premirror/git2/MIRRORNAME;protocol=file \n \
     gitsm://.*/.* gitsm://${TOPDIR}/downloads-premirror/git2/MIRRORNAME;protocol=file \n \
   "
3. Set BB_NO_NETWORK = "1" and then 'bitbake grpc -c fetch'.
   This command succeeds and this shows that the premirror holds everything we need.
4. Add the following lines and then 'bitbake grpc -c fetch'.
   BB_NO_NETWORK = "0"
   BB_ALLOWED_NETWORKS = "*.some.domain"

After step 4, the error message is as below:
ERROR: grpc-1.60.0-r0 do_fetch: The URL: 'gitsm://github.com/protocolbuffers/protobuf.git;protocol=https;name=third_party/protobuf;subpath=third_party/protobuf;nobranch=1;lfs=True;bareclone=1;nobranch=1' is not trusted and cannot be used

I guess that the try_premirror codes in git fetcher (fetch2/git.py) are assuming
that the clonedir syncs with the mirror repo. That's only reason I can think of
for why it returns 'False' if the clonedir exists. The assumption is wrong. The
mirror could well be updated via automation tools or some project-level setup
tools. Yet, it's returning True when BB_FETCH_PREMIRRORONLY is set, regardless
of whether the clonedir exists. This means it does know that the mirror could be
updated. It seems that it treats using mirror as some kind of 'best effort'. Anyway,
the codes' logic is just contradicting with itself.

If the mirror can be possibly updated, then it should be tried regardless of whether
the clonedir exists. So we should remove the try_premirror codes here and use the
default one (in __init__.py), which always returns True.

Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
---
 lib/bb/fetch2/git.py | 9 ---------
 1 file changed, 9 deletions(-)

Comments

Richard Purdie Jan. 30, 2024, 2:56 p.m. UTC | #1
On Tue, 2024-01-30 at 19:11 +0800, Chen Qi via lists.openembedded.org
wrote:
> From: Chen Qi <Qi.Chen@windriver.com>
> 
> For gitsm recipes, it's possible that some URL is used more than once.
> e.g.,
> A -> B:rev1 (B is a submodule of A)
> A -> C (C is a submodule of A)
> C -> B:rev2 (B is a submodule of C)
> A anc C are both using B as submodules, but on different revs.
> Now if we have:
> B:rev1 -> D
> B:rev2 -> E
> Then, the mirror will not be fully used.
> Say we have all repo mirrors for A, B, C, D, E, then in theory it's not
> necessary to reach out to any network for downloading. But it's not the
> case. After downloading B(rev1) and its submodule D from mirrors, the fetch
> process continues to download C, thus B(rev2) and E. Now it finds that B
> needs an update because its submodule E needs an update. Of course this is
> true because E is not downloaded yet. Now the problem comes to whether to
> use mirror or not. The git.py defines try_premirror to return 'False' when
> the ud.clonedir exists. As B has been cloned, the ud.clonedir exists and
> try_mirror returns False, resulting in not using mirror and going to upstream
> directly.
> 
> We can see that the mirrors are not fully used. This is usually not problem,
> as the cost is only some network download. But in case the following two
> settings are there, we get errors.
> BB_NO_NETWORK = "0"
> BB_ALLOWED_NETWORKS = "*.some.allowed.domain"
> In such case, the gitsm recipe A will fail to fetch. Note that all contents
> that A needs are in mirrors and now it's failing to fetch. This is unexpected.
> 
> Note that the different revs of the same repo in gitsm recipe is not the only
> way to reveal this problem. For example, there might be a recipe call B that
> uses B:rev3. Check the protobuf and grpc recipes as an example.
> 
> For now, we can use the following steps to reproduce this issue. To be clear,
> the grpc recipe in meta-oe is now 1.60.0.
> 1. Add in local.conf:
>    DL_DIR = "${TOPDIR}/downloads-premirror"
>    bitbake grpc -c fetch
> 2. Comment out the DL_DIR setting in local.conf and add the following lines:
>    PREMIRRORS:append = " \
>      git://.*/.* git://${TOPDIR}/downloads-premirror/git2/MIRRORNAME;protocol=file \n \
>      gitsm://.*/.* gitsm://${TOPDIR}/downloads-premirror/git2/MIRRORNAME;protocol=file \n \
>    "
> 3. Set BB_NO_NETWORK = "1" and then 'bitbake grpc -c fetch'.
>    This command succeeds and this shows that the premirror holds everything we need.
> 4. Add the following lines and then 'bitbake grpc -c fetch'.
>    BB_NO_NETWORK = "0"
>    BB_ALLOWED_NETWORKS = "*.some.domain"
> 
> After step 4, the error message is as below:
> ERROR: grpc-1.60.0-r0 do_fetch: The URL: 'gitsm://github.com/protocolbuffers/protobuf.git;protocol=https;name=third_party/protobuf;subpath=third_party/protobuf;nobranch=1;lfs=True;bareclone=1;nobranch=1' is not trusted and cannot be used
> 
> I guess that the try_premirror codes in git fetcher (fetch2/git.py) are assuming
> that the clonedir syncs with the mirror repo. That's only reason I can think of
> for why it returns 'False' if the clonedir exists. The assumption is wrong. The
> mirror could well be updated via automation tools or some project-level setup
> tools. Yet, it's returning True when BB_FETCH_PREMIRRORONLY is set, regardless
> of whether the clonedir exists. This means it does know that the mirror could be
> updated. It seems that it treats using mirror as some kind of 'best effort'. Anyway,
> the codes' logic is just contradicting with itself.
> 
> If the mirror can be possibly updated, then it should be tried regardless of whether
> the clonedir exists. So we should remove the try_premirror codes here and use the
> default one (in __init__.py), which always returns True.
> 
> Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
> ---
>  lib/bb/fetch2/git.py | 9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
> index 7aa3fd9e5..62618012b 100644
> --- a/lib/bb/fetch2/git.py
> +++ b/lib/bb/fetch2/git.py
> @@ -353,15 +353,6 @@ class Git(FetchMethod):
>      def tarball_need_update(self, ud):
>          return ud.write_tarballs and not os.path.exists(ud.fullmirror)
>  
> -    def try_premirror(self, ud, d):
> -        # If we don't do this, updating an existing checkout with only premirrors
> -        # is not possible
> -        if bb.utils.to_boolean(d.getVar("BB_FETCH_PREMIRRORONLY")):
> -            return True
> -        if os.path.exists(ud.clonedir):
> -            return False
> -        return True
> -
>      def download(self, ud, d):
>          """Fetch url"""

I'm not sure this is a good idea. The problem is that the default
fetcher would not try and incrementally fetch into an existing clone
without this code and that is a common workflow.

Did the existing test cases pass after this change?

We probably should have a test case for this issue too. Things have
changes now mirrors/premirrors are non-destructive to existing git
repos so we may be able to improve the core logic to fix this corner
case but it can't be at the expense of some of the other workflows.

Cheers,

Richard
Chen, Qi Jan. 31, 2024, 3:41 a.m. UTC | #2
On 1/30/24 22:56, Richard Purdie wrote:
> On Tue, 2024-01-30 at 19:11 +0800, Chen Qi via lists.openembedded.org
> wrote:
>> From: Chen Qi <Qi.Chen@windriver.com>
>>
>> For gitsm recipes, it's possible that some URL is used more than once.
>> e.g.,
>> A -> B:rev1 (B is a submodule of A)
>> A -> C (C is a submodule of A)
>> C -> B:rev2 (B is a submodule of C)
>> A anc C are both using B as submodules, but on different revs.
>> Now if we have:
>> B:rev1 -> D
>> B:rev2 -> E
>> Then, the mirror will not be fully used.
>> Say we have all repo mirrors for A, B, C, D, E, then in theory it's not
>> necessary to reach out to any network for downloading. But it's not the
>> case. After downloading B(rev1) and its submodule D from mirrors, the fetch
>> process continues to download C, thus B(rev2) and E. Now it finds that B
>> needs an update because its submodule E needs an update. Of course this is
>> true because E is not downloaded yet. Now the problem comes to whether to
>> use mirror or not. The git.py defines try_premirror to return 'False' when
>> the ud.clonedir exists. As B has been cloned, the ud.clonedir exists and
>> try_mirror returns False, resulting in not using mirror and going to upstream
>> directly.
>>
>> We can see that the mirrors are not fully used. This is usually not problem,
>> as the cost is only some network download. But in case the following two
>> settings are there, we get errors.
>> BB_NO_NETWORK = "0"
>> BB_ALLOWED_NETWORKS = "*.some.allowed.domain"
>> In such case, the gitsm recipe A will fail to fetch. Note that all contents
>> that A needs are in mirrors and now it's failing to fetch. This is unexpected.
>>
>> Note that the different revs of the same repo in gitsm recipe is not the only
>> way to reveal this problem. For example, there might be a recipe call B that
>> uses B:rev3. Check the protobuf and grpc recipes as an example.
>>
>> For now, we can use the following steps to reproduce this issue. To be clear,
>> the grpc recipe in meta-oe is now 1.60.0.
>> 1. Add in local.conf:
>>     DL_DIR = "${TOPDIR}/downloads-premirror"
>>     bitbake grpc -c fetch
>> 2. Comment out the DL_DIR setting in local.conf and add the following lines:
>>     PREMIRRORS:append = " \
>>       git://.*/.* git://${TOPDIR}/downloads-premirror/git2/MIRRORNAME;protocol=file \n \
>>       gitsm://.*/.* gitsm://${TOPDIR}/downloads-premirror/git2/MIRRORNAME;protocol=file \n \
>>     "
>> 3. Set BB_NO_NETWORK = "1" and then 'bitbake grpc -c fetch'.
>>     This command succeeds and this shows that the premirror holds everything we need.
>> 4. Add the following lines and then 'bitbake grpc -c fetch'.
>>     BB_NO_NETWORK = "0"
>>     BB_ALLOWED_NETWORKS = "*.some.domain"
>>
>> After step 4, the error message is as below:
>> ERROR: grpc-1.60.0-r0 do_fetch: The URL: 'gitsm://github.com/protocolbuffers/protobuf.git;protocol=https;name=third_party/protobuf;subpath=third_party/protobuf;nobranch=1;lfs=True;bareclone=1;nobranch=1' is not trusted and cannot be used
>>
>> I guess that the try_premirror codes in git fetcher (fetch2/git.py) are assuming
>> that the clonedir syncs with the mirror repo. That's only reason I can think of
>> for why it returns 'False' if the clonedir exists. The assumption is wrong. The
>> mirror could well be updated via automation tools or some project-level setup
>> tools. Yet, it's returning True when BB_FETCH_PREMIRRORONLY is set, regardless
>> of whether the clonedir exists. This means it does know that the mirror could be
>> updated. It seems that it treats using mirror as some kind of 'best effort'. Anyway,
>> the codes' logic is just contradicting with itself.
>>
>> If the mirror can be possibly updated, then it should be tried regardless of whether
>> the clonedir exists. So we should remove the try_premirror codes here and use the
>> default one (in __init__.py), which always returns True.
>>
>> Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
>> ---
>>   lib/bb/fetch2/git.py | 9 ---------
>>   1 file changed, 9 deletions(-)
>>
>> diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
>> index 7aa3fd9e5..62618012b 100644
>> --- a/lib/bb/fetch2/git.py
>> +++ b/lib/bb/fetch2/git.py
>> @@ -353,15 +353,6 @@ class Git(FetchMethod):
>>       def tarball_need_update(self, ud):
>>           return ud.write_tarballs and not os.path.exists(ud.fullmirror)
>>   
>> -    def try_premirror(self, ud, d):
>> -        # If we don't do this, updating an existing checkout with only premirrors
>> -        # is not possible
>> -        if bb.utils.to_boolean(d.getVar("BB_FETCH_PREMIRRORONLY")):
>> -            return True
>> -        if os.path.exists(ud.clonedir):
>> -            return False
>> -        return True
>> -
>>       def download(self, ud, d):
>>           """Fetch url"""
> I'm not sure this is a good idea. The problem is that the default
> fetcher would not try and incrementally fetch into an existing clone
> without this code and that is a common workflow.

'Incremental fetch'. That's the reason! Thanks.

I'll come up with another solution which will not break this workflow.

> Did the existing test cases pass after this change?

Is the following command all I need to ensure no regression in bitbake 
fetcher?

bitbake-selftest bb.tests.fetch


>
> We probably should have a test case for this issue too.
Yes. I'll add one.
>   Things have
> changes now mirrors/premirrors are non-destructive to existing git
> repos so we may be able to improve the core logic to fix this corner
> case but it can't be at the expense of some of the other workflows.

Got it.

I just found a related bug here: 
https://bugzilla.yoctoproject.org/show_bug.cgi?id=15369

Though it's not the same problem, it's also related to try_premirror logic.

I'll add some comments in the try_premirror function to avoid confusion 
in the future. I'll also check to see if I can add a test case to cover 
this 'incremental fetch' workflow.

Regards,

Qi


>
> Cheers,
>
> Richard
diff mbox series

Patch

diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
index 7aa3fd9e5..62618012b 100644
--- a/lib/bb/fetch2/git.py
+++ b/lib/bb/fetch2/git.py
@@ -353,15 +353,6 @@  class Git(FetchMethod):
     def tarball_need_update(self, ud):
         return ud.write_tarballs and not os.path.exists(ud.fullmirror)
 
-    def try_premirror(self, ud, d):
-        # If we don't do this, updating an existing checkout with only premirrors
-        # is not possible
-        if bb.utils.to_boolean(d.getVar("BB_FETCH_PREMIRRORONLY")):
-            return True
-        if os.path.exists(ud.clonedir):
-            return False
-        return True
-
     def download(self, ud, d):
         """Fetch url"""