base/patch: Disable network for unpack/patch/configure/compile/install

Message ID 20211222232035.1036830-1-richard.purdie@linuxfoundation.org
State New
Headers show
Series base/patch: Disable network for unpack/patch/configure/compile/install | expand

Commit Message

Richard Purdie Dec. 22, 2021, 11:20 p.m. UTC
Use the newly added nonetwork task flag to disable network access where
possible in unpack/patch/configure/compile/install tasks.

We can't disable networking in sstate tasks due to sstate downloads and
also so we can report hash equivalence to the server.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 meta/classes/base.bbclass  | 4 ++++
 meta/classes/patch.bbclass | 1 +
 2 files changed, 5 insertions(+)

Comments

Alexander Kanavin Dec. 23, 2021, 5:28 a.m. UTC | #1
Should there be tests for this? Would be good to check that the network is
indeed disabled in these tasks.

Alex

On Thu 23. Dec 2021 at 2.20, Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> Use the newly added nonetwork task flag to disable network access where
> possible in unpack/patch/configure/compile/install tasks.
>
> We can't disable networking in sstate tasks due to sstate downloads and
> also so we can report hash equivalence to the server.
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
>  meta/classes/base.bbclass  | 4 ++++
>  meta/classes/patch.bbclass | 1 +
>  2 files changed, 5 insertions(+)
>
> diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
> index b709777f243..e4c6c983b59 100644
> --- a/meta/classes/base.bbclass
> +++ b/meta/classes/base.bbclass
> @@ -214,6 +214,7 @@ python create_source_date_epoch_stamp() {
>      oe.reproducible.epochfile_write(source_date_epoch,
> d.getVar('SDE_FILE'), d)
>  }
>  do_unpack[postfuncs] += "create_source_date_epoch_stamp"
> +do_unpack[nonetwork] = "1"
>
>  def get_source_date_epoch_value(d):
>      return oe.reproducible.epochfile_read(d.getVar('SDE_FILE'), d)
> @@ -358,6 +359,7 @@ base_do_configure() {
>                 echo ${BB_TASKHASH} > ${CONFIGURESTAMPFILE}
>         fi
>  }
> +do_configure[nonetwork] = "1"
>
>  addtask compile after do_configure
>  do_compile[dirs] = "${B}"
> @@ -368,11 +370,13 @@ base_do_compile() {
>                 bbnote "nothing to compile"
>         fi
>  }
> +do_compile[nonetwork] = "1"
>
>  addtask install after do_compile
>  do_install[dirs] = "${B}"
>  # Remove and re-create ${D} so that is it guaranteed to be empty
>  do_install[cleandirs] = "${D}"
> +do_install[nonetwork] = "1"
>
>  base_do_install() {
>         :
> diff --git a/meta/classes/patch.bbclass b/meta/classes/patch.bbclass
> index 8de70254919..57aaf7c31d1 100644
> --- a/meta/classes/patch.bbclass
> +++ b/meta/classes/patch.bbclass
> @@ -164,6 +164,7 @@ patch_do_patch[vardepsexclude] = "PATCHRESOLVE"
>
>  addtask patch after do_unpack
>  do_patch[dirs] = "${WORKDIR}"
> +do_patch[nonetwork] = "1"
>  do_patch[depends] = "${PATCHDEPENDENCY}"
>
>  EXPORT_FUNCTIONS do_patch
> --
> 2.32.0
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#159965):
> https://lists.openembedded.org/g/openembedded-core/message/159965
> Mute This Topic: https://lists.openembedded.org/mt/87909311/1686489
> Group Owner: openembedded-core+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [
> alex.kanavin@gmail.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
>
Peter Kjellerstedt Dec. 23, 2021, 10:49 a.m. UTC | #2
> -----Original Message-----
> From: openembedded-core@lists.openembedded.org <openembedded-core@lists.openembedded.org> On Behalf Of Richard Purdie
> Sent: den 23 december 2021 00:21
> To: openembedded-core@lists.openembedded.org
> Subject: [OE-core] [PATCH] base/patch: Disable network for unpack/patch/configure/compile/install
> 
> Use the newly added nonetwork task flag to disable network access where
> possible in unpack/patch/configure/compile/install tasks.
> 
> We can't disable networking in sstate tasks due to sstate downloads and
> also so we can report hash equivalence to the server.

Since no tasks except fetch (and apparently the sstate tasks) are expected 
to use the network, wouldn't it make more sense to reverse this flag? I.e., 
add do_fetch[network] = "1" instead. That way you don't get away with 
adding some random task and using the network from it unless you explicitly 
state that you will.

> 
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
>  meta/classes/base.bbclass  | 4 ++++
>  meta/classes/patch.bbclass | 1 +
>  2 files changed, 5 insertions(+)
> 
> diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
> index b709777f243..e4c6c983b59 100644
> --- a/meta/classes/base.bbclass
> +++ b/meta/classes/base.bbclass
> @@ -214,6 +214,7 @@ python create_source_date_epoch_stamp() {
>      oe.reproducible.epochfile_write(source_date_epoch, d.getVar('SDE_FILE'), d)
>  }
>  do_unpack[postfuncs] += "create_source_date_epoch_stamp"
> +do_unpack[nonetwork] = "1"
> 
>  def get_source_date_epoch_value(d):
>      return oe.reproducible.epochfile_read(d.getVar('SDE_FILE'), d)
> @@ -358,6 +359,7 @@ base_do_configure() {
>  		echo ${BB_TASKHASH} > ${CONFIGURESTAMPFILE}
>  	fi
>  }
> +do_configure[nonetwork] = "1"
> 
>  addtask compile after do_configure
>  do_compile[dirs] = "${B}"
> @@ -368,11 +370,13 @@ base_do_compile() {
>  		bbnote "nothing to compile"
>  	fi
>  }
> +do_compile[nonetwork] = "1"
> 
>  addtask install after do_compile
>  do_install[dirs] = "${B}"
>  # Remove and re-create ${D} so that is it guaranteed to be empty
>  do_install[cleandirs] = "${D}"
> +do_install[nonetwork] = "1"
> 
>  base_do_install() {
>  	:
> diff --git a/meta/classes/patch.bbclass b/meta/classes/patch.bbclass
> index 8de70254919..57aaf7c31d1 100644
> --- a/meta/classes/patch.bbclass
> +++ b/meta/classes/patch.bbclass
> @@ -164,6 +164,7 @@ patch_do_patch[vardepsexclude] = "PATCHRESOLVE"
> 
>  addtask patch after do_unpack
>  do_patch[dirs] = "${WORKDIR}"
> +do_patch[nonetwork] = "1"
>  do_patch[depends] = "${PATCHDEPENDENCY}"
> 
>  EXPORT_FUNCTIONS do_patch
> --
> 2.32.0

//Peter
Konrad Weihmann Dec. 23, 2021, 11:31 a.m. UTC | #3
On 23.12.21 11:49, Peter Kjellerstedt wrote:
>> -----Original Message-----
>> From: openembedded-core@lists.openembedded.org <openembedded-core@lists.openembedded.org> On Behalf Of Richard Purdie
>> Sent: den 23 december 2021 00:21
>> To: openembedded-core@lists.openembedded.org
>> Subject: [OE-core] [PATCH] base/patch: Disable network for unpack/patch/configure/compile/install
>>
>> Use the newly added nonetwork task flag to disable network access where
>> possible in unpack/patch/configure/compile/install tasks.
>>
>> We can't disable networking in sstate tasks due to sstate downloads and
>> also so we can report hash equivalence to the server.
> 
> Since no tasks except fetch (and apparently the sstate tasks) are expected
> to use the network, wouldn't it make more sense to reverse this flag? I.e.,
> add do_fetch[network] = "1" instead. That way you don't get away with
> adding some random task and using the network from it unless you explicitly
> state that you will.

This is actually a brilliant idea, which would also make it easier to 
control this behavior from a user's perspective

> 
>>
>> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
>> ---
>>   meta/classes/base.bbclass  | 4 ++++
>>   meta/classes/patch.bbclass | 1 +
>>   2 files changed, 5 insertions(+)
>>
>> diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
>> index b709777f243..e4c6c983b59 100644
>> --- a/meta/classes/base.bbclass
>> +++ b/meta/classes/base.bbclass
>> @@ -214,6 +214,7 @@ python create_source_date_epoch_stamp() {
>>       oe.reproducible.epochfile_write(source_date_epoch, d.getVar('SDE_FILE'), d)
>>   }
>>   do_unpack[postfuncs] += "create_source_date_epoch_stamp"
>> +do_unpack[nonetwork] = "1"
>>
>>   def get_source_date_epoch_value(d):
>>       return oe.reproducible.epochfile_read(d.getVar('SDE_FILE'), d)
>> @@ -358,6 +359,7 @@ base_do_configure() {
>>   		echo ${BB_TASKHASH} > ${CONFIGURESTAMPFILE}
>>   	fi
>>   }
>> +do_configure[nonetwork] = "1"
>>
>>   addtask compile after do_configure
>>   do_compile[dirs] = "${B}"
>> @@ -368,11 +370,13 @@ base_do_compile() {
>>   		bbnote "nothing to compile"
>>   	fi
>>   }
>> +do_compile[nonetwork] = "1"
>>
>>   addtask install after do_compile
>>   do_install[dirs] = "${B}"
>>   # Remove and re-create ${D} so that is it guaranteed to be empty
>>   do_install[cleandirs] = "${D}"
>> +do_install[nonetwork] = "1"
>>
>>   base_do_install() {
>>   	:
>> diff --git a/meta/classes/patch.bbclass b/meta/classes/patch.bbclass
>> index 8de70254919..57aaf7c31d1 100644
>> --- a/meta/classes/patch.bbclass
>> +++ b/meta/classes/patch.bbclass
>> @@ -164,6 +164,7 @@ patch_do_patch[vardepsexclude] = "PATCHRESOLVE"
>>
>>   addtask patch after do_unpack
>>   do_patch[dirs] = "${WORKDIR}"
>> +do_patch[nonetwork] = "1"
>>   do_patch[depends] = "${PATCHDEPENDENCY}"
>>
>>   EXPORT_FUNCTIONS do_patch
>> --
>> 2.32.0
> 
> //Peter
> 
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#159975): https://lists.openembedded.org/g/openembedded-core/message/159975
> Mute This Topic: https://lists.openembedded.org/mt/87909311/3647476
> Group Owner: openembedded-core+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [kweihmann@outlook.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
Richard Purdie Dec. 23, 2021, 1:11 p.m. UTC | #4
On Thu, 2021-12-23 at 12:31 +0100, Konrad Weihmann wrote:
> 
> On 23.12.21 11:49, Peter Kjellerstedt wrote:
> > > -----Original Message-----
> > > From: openembedded-core@lists.openembedded.org <openembedded-core@lists.openembedded.org> On Behalf Of Richard Purdie
> > > Sent: den 23 december 2021 00:21
> > > To: openembedded-core@lists.openembedded.org
> > > Subject: [OE-core] [PATCH] base/patch: Disable network for unpack/patch/configure/compile/install
> > > 
> > > Use the newly added nonetwork task flag to disable network access where
> > > possible in unpack/patch/configure/compile/install tasks.
> > > 
> > > We can't disable networking in sstate tasks due to sstate downloads and
> > > also so we can report hash equivalence to the server.
> > 
> > Since no tasks except fetch (and apparently the sstate tasks) are expected
> > to use the network, wouldn't it make more sense to reverse this flag? I.e.,
> > add do_fetch[network] = "1" instead. That way you don't get away with
> > adding some random task and using the network from it unless you explicitly
> > state that you will.
> 
> This is actually a brilliant idea, which would also make it easier to 
> control this behavior from a user's perspective

Part of me wonders if we really do want to make this "easy" for the user :/

Cheers,

Richard
Richard Purdie Dec. 23, 2021, 1:12 p.m. UTC | #5
On Thu, 2021-12-23 at 08:28 +0300, Alexander Kanavin wrote:
> Should there be tests for this? Would be good to check that the network is
> indeed disabled in these tasks.

I think Ross was looking at that. The challenge will be the list needed to know
which distros this works on and which it does not (I know centos7, debian9 and
debian10 don't work)...

Cheers,

Richard
Konrad Weihmann Dec. 23, 2021, 1:19 p.m. UTC | #6
On 23.12.21 14:11, Richard Purdie wrote:
> On Thu, 2021-12-23 at 12:31 +0100, Konrad Weihmann wrote:
>>
>> On 23.12.21 11:49, Peter Kjellerstedt wrote:
>>>> -----Original Message-----
>>>> From: openembedded-core@lists.openembedded.org <openembedded-core@lists.openembedded.org> On Behalf Of Richard Purdie
>>>> Sent: den 23 december 2021 00:21
>>>> To: openembedded-core@lists.openembedded.org
>>>> Subject: [OE-core] [PATCH] base/patch: Disable network for unpack/patch/configure/compile/install
>>>>
>>>> Use the newly added nonetwork task flag to disable network access where
>>>> possible in unpack/patch/configure/compile/install tasks.
>>>>
>>>> We can't disable networking in sstate tasks due to sstate downloads and
>>>> also so we can report hash equivalence to the server.
>>>
>>> Since no tasks except fetch (and apparently the sstate tasks) are expected
>>> to use the network, wouldn't it make more sense to reverse this flag? I.e.,
>>> add do_fetch[network] = "1" instead. That way you don't get away with
>>> adding some random task and using the network from it unless you explicitly
>>> state that you will.
>>
>> This is actually a brilliant idea, which would also make it easier to
>> control this behavior from a user's perspective
> 
> Part of me wonders if we really do want to make this "easy" for the user :/

"easier" may have sounded wrong - what I mean is that it would be then 
explicit, with no network access being the default.

So for instance a simple grep on a new layer could confirm whether a 
layer needs network access or not - while with the opt-out method of 
your patch things may be a bit more difficult to check.

side note: this could also be put into the check-layer script (as I 
think the official badge shall just go to layers that do not need 
network access out of the scope defined by core)

And yes I got a ton of additional tasks in the layers I work with and 
none of these would be affected by the opt-in model.

Either way, I really love to see this feature finally happen

> 
> Cheers,
> 
> Richard
>
Andreas Müller Dec. 23, 2021, 2:52 p.m. UTC | #7
On Thu, Dec 23, 2021 at 2:19 PM Konrad Weihmann <kweihmann@outlook.com> wrote:
>
>
>
> On 23.12.21 14:11, Richard Purdie wrote:
> > On Thu, 2021-12-23 at 12:31 +0100, Konrad Weihmann wrote:
> >>
> >> On 23.12.21 11:49, Peter Kjellerstedt wrote:
> >>>> -----Original Message-----
> >>>> From: openembedded-core@lists.openembedded.org <openembedded-core@lists.openembedded.org> On Behalf Of Richard Purdie
> >>>> Sent: den 23 december 2021 00:21
> >>>> To: openembedded-core@lists.openembedded.org
> >>>> Subject: [OE-core] [PATCH] base/patch: Disable network for unpack/patch/configure/compile/install
> >>>>
> >>>> Use the newly added nonetwork task flag to disable network access where
> >>>> possible in unpack/patch/configure/compile/install tasks.
> >>>>
> >>>> We can't disable networking in sstate tasks due to sstate downloads and
> >>>> also so we can report hash equivalence to the server.
> >>>
> >>> Since no tasks except fetch (and apparently the sstate tasks) are expected
> >>> to use the network, wouldn't it make more sense to reverse this flag? I.e.,
> >>> add do_fetch[network] = "1" instead. That way you don't get away with
> >>> adding some random task and using the network from it unless you explicitly
> >>> state that you will.
> >>
> >> This is actually a brilliant idea, which would also make it easier to
> >> control this behavior from a user's perspective
> >
> > Part of me wonders if we really do want to make this "easy" for the user :/
>
> "easier" may have sounded wrong - what I mean is that it would be then
> explicit, with no network access being the default.
>
> So for instance a simple grep on a new layer could confirm whether a
> layer needs network access or not - while with the opt-out method of
> your patch things may be a bit more difficult to check.
>
> side note: this could also be put into the check-layer script (as I
> think the official badge shall just go to layers that do not need
> network access out of the scope defined by core)
>
> And yes I got a ton of additional tasks in the layers I work with and
> none of these would be affected by the opt-in model.
>
> Either way, I really love to see this feature finally happen
>
Out of curiosity (will see fallout at least on libreoffice): Why is
this a feature to love?

Andreas
Konrad Weihmann Dec. 23, 2021, 3:01 p.m. UTC | #8
On 23.12.21 15:52, Andreas Müller wrote:
> On Thu, Dec 23, 2021 at 2:19 PM Konrad Weihmann <kweihmann@outlook.com> wrote:
>>
>>
>>
>> On 23.12.21 14:11, Richard Purdie wrote:
>>> On Thu, 2021-12-23 at 12:31 +0100, Konrad Weihmann wrote:
>>>>
>>>> On 23.12.21 11:49, Peter Kjellerstedt wrote:
>>>>>> -----Original Message-----
>>>>>> From: openembedded-core@lists.openembedded.org <openembedded-core@lists.openembedded.org> On Behalf Of Richard Purdie
>>>>>> Sent: den 23 december 2021 00:21
>>>>>> To: openembedded-core@lists.openembedded.org
>>>>>> Subject: [OE-core] [PATCH] base/patch: Disable network for unpack/patch/configure/compile/install
>>>>>>
>>>>>> Use the newly added nonetwork task flag to disable network access where
>>>>>> possible in unpack/patch/configure/compile/install tasks.
>>>>>>
>>>>>> We can't disable networking in sstate tasks due to sstate downloads and
>>>>>> also so we can report hash equivalence to the server.
>>>>>
>>>>> Since no tasks except fetch (and apparently the sstate tasks) are expected
>>>>> to use the network, wouldn't it make more sense to reverse this flag? I.e.,
>>>>> add do_fetch[network] = "1" instead. That way you don't get away with
>>>>> adding some random task and using the network from it unless you explicitly
>>>>> state that you will.
>>>>
>>>> This is actually a brilliant idea, which would also make it easier to
>>>> control this behavior from a user's perspective
>>>
>>> Part of me wonders if we really do want to make this "easy" for the user :/
>>
>> "easier" may have sounded wrong - what I mean is that it would be then
>> explicit, with no network access being the default.
>>
>> So for instance a simple grep on a new layer could confirm whether a
>> layer needs network access or not - while with the opt-out method of
>> your patch things may be a bit more difficult to check.
>>
>> side note: this could also be put into the check-layer script (as I
>> think the official badge shall just go to layers that do not need
>> network access out of the scope defined by core)
>>
>> And yes I got a ton of additional tasks in the layers I work with and
>> none of these would be affected by the opt-in model.
>>
>> Either way, I really love to see this feature finally happen
>>
> Out of curiosity (will see fallout at least on libreoffice): Why is
> this a feature to love?

Simply said because it makes validation of changes way easier.
In before I had to build a change in a normal environment (w network 
access) to verify that it actually builds, then populate some mirror and 
rebuild the same change in an environment without network access.
Now I only will need one run to be sure that it the same recipe will 
even build in 10yrs from now (assuming that a dl-mirror exists).

> 
> Andreas
>
Jose Quaresma Dec. 23, 2021, 3:11 p.m. UTC | #9
Peter Kjellerstedt <peter.kjellerstedt@axis.com> escreveu no dia quinta,
23/12/2021 à(s) 10:49:

> > -----Original Message-----
> > From: openembedded-core@lists.openembedded.org <
> openembedded-core@lists.openembedded.org> On Behalf Of Richard Purdie
> > Sent: den 23 december 2021 00:21
> > To: openembedded-core@lists.openembedded.org
> > Subject: [OE-core] [PATCH] base/patch: Disable network for
> unpack/patch/configure/compile/install
> >
> > Use the newly added nonetwork task flag to disable network access where
> > possible in unpack/patch/configure/compile/install tasks.
> >
> > We can't disable networking in sstate tasks due to sstate downloads and
> > also so we can report hash equivalence to the server.
>
> Since no tasks except fetch (and apparently the sstate tasks) are expected
> to use the network, wouldn't it make more sense to reverse this flag?
> I.e.,
> add do_fetch[network] = "1" instead. That way you don't get away with
> adding some random task and using the network from it unless you
> explicitly
> state that you will.
>

It is more safe and easy to check what are the tasks that need network
access.

Jose


>
> >
> > Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> > ---
> >  meta/classes/base.bbclass  | 4 ++++
> >  meta/classes/patch.bbclass | 1 +
> >  2 files changed, 5 insertions(+)
> >
> > diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
> > index b709777f243..e4c6c983b59 100644
> > --- a/meta/classes/base.bbclass
> > +++ b/meta/classes/base.bbclass
> > @@ -214,6 +214,7 @@ python create_source_date_epoch_stamp() {
> >      oe.reproducible.epochfile_write(source_date_epoch,
> d.getVar('SDE_FILE'), d)
> >  }
> >  do_unpack[postfuncs] += "create_source_date_epoch_stamp"
> > +do_unpack[nonetwork] = "1"
> >
> >  def get_source_date_epoch_value(d):
> >      return oe.reproducible.epochfile_read(d.getVar('SDE_FILE'), d)
> > @@ -358,6 +359,7 @@ base_do_configure() {
> >               echo ${BB_TASKHASH} > ${CONFIGURESTAMPFILE}
> >       fi
> >  }
> > +do_configure[nonetwork] = "1"
> >
> >  addtask compile after do_configure
> >  do_compile[dirs] = "${B}"
> > @@ -368,11 +370,13 @@ base_do_compile() {
> >               bbnote "nothing to compile"
> >       fi
> >  }
> > +do_compile[nonetwork] = "1"
> >
> >  addtask install after do_compile
> >  do_install[dirs] = "${B}"
> >  # Remove and re-create ${D} so that is it guaranteed to be empty
> >  do_install[cleandirs] = "${D}"
> > +do_install[nonetwork] = "1"
> >
> >  base_do_install() {
> >       :
> > diff --git a/meta/classes/patch.bbclass b/meta/classes/patch.bbclass
> > index 8de70254919..57aaf7c31d1 100644
> > --- a/meta/classes/patch.bbclass
> > +++ b/meta/classes/patch.bbclass
> > @@ -164,6 +164,7 @@ patch_do_patch[vardepsexclude] = "PATCHRESOLVE"
> >
> >  addtask patch after do_unpack
> >  do_patch[dirs] = "${WORKDIR}"
> > +do_patch[nonetwork] = "1"
> >  do_patch[depends] = "${PATCHDEPENDENCY}"
> >
> >  EXPORT_FUNCTIONS do_patch
> > --
> > 2.32.0
>
> //Peter
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#159975):
> https://lists.openembedded.org/g/openembedded-core/message/159975
> Mute This Topic: https://lists.openembedded.org/mt/87909311/5052612
> Group Owner: openembedded-core+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [
> quaresma.jose@gmail.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
>
Alexander Kanavin Dec. 23, 2021, 3:54 p.m. UTC | #10
On Thu, 23 Dec 2021 at 14:11, Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> > This is actually a brilliant idea, which would also make it easier to
> > control this behavior from a user's perspective
>
> Part of me wonders if we really do want to make this "easy" for the user :/
>

If people are determined to subvert this, they're likely to do it by
sandwiching additional tasks between configure/compile/install in both
cases. It's better to explicitly force adding the 'network' flag to those
extra tasks then, as it can be clearly seen, and questions can be asked in
code review.

Alex
Khem Raj Dec. 24, 2021, 6 a.m. UTC | #11
is this list of failures due to this patch ?
https://errors.yoctoproject.org/Errors/Build/137579/

On Wed, Dec 22, 2021 at 3:20 PM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> Use the newly added nonetwork task flag to disable network access where
> possible in unpack/patch/configure/compile/install tasks.
>
> We can't disable networking in sstate tasks due to sstate downloads and
> also so we can report hash equivalence to the server.
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
>  meta/classes/base.bbclass  | 4 ++++
>  meta/classes/patch.bbclass | 1 +
>  2 files changed, 5 insertions(+)
>
> diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
> index b709777f243..e4c6c983b59 100644
> --- a/meta/classes/base.bbclass
> +++ b/meta/classes/base.bbclass
> @@ -214,6 +214,7 @@ python create_source_date_epoch_stamp() {
>      oe.reproducible.epochfile_write(source_date_epoch, d.getVar('SDE_FILE'), d)
>  }
>  do_unpack[postfuncs] += "create_source_date_epoch_stamp"
> +do_unpack[nonetwork] = "1"
>
>  def get_source_date_epoch_value(d):
>      return oe.reproducible.epochfile_read(d.getVar('SDE_FILE'), d)
> @@ -358,6 +359,7 @@ base_do_configure() {
>                 echo ${BB_TASKHASH} > ${CONFIGURESTAMPFILE}
>         fi
>  }
> +do_configure[nonetwork] = "1"
>
>  addtask compile after do_configure
>  do_compile[dirs] = "${B}"
> @@ -368,11 +370,13 @@ base_do_compile() {
>                 bbnote "nothing to compile"
>         fi
>  }
> +do_compile[nonetwork] = "1"
>
>  addtask install after do_compile
>  do_install[dirs] = "${B}"
>  # Remove and re-create ${D} so that is it guaranteed to be empty
>  do_install[cleandirs] = "${D}"
> +do_install[nonetwork] = "1"
>
>  base_do_install() {
>         :
> diff --git a/meta/classes/patch.bbclass b/meta/classes/patch.bbclass
> index 8de70254919..57aaf7c31d1 100644
> --- a/meta/classes/patch.bbclass
> +++ b/meta/classes/patch.bbclass
> @@ -164,6 +164,7 @@ patch_do_patch[vardepsexclude] = "PATCHRESOLVE"
>
>  addtask patch after do_unpack
>  do_patch[dirs] = "${WORKDIR}"
> +do_patch[nonetwork] = "1"
>  do_patch[depends] = "${PATCHDEPENDENCY}"
>
>  EXPORT_FUNCTIONS do_patch
> --
> 2.32.0
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#159965): https://lists.openembedded.org/g/openembedded-core/message/159965
> Mute This Topic: https://lists.openembedded.org/mt/87909311/1997914
> Group Owner: openembedded-core+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [raj.khem@gmail.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
Richard Purdie Dec. 24, 2021, 8:30 a.m. UTC | #12
On Thu, 2021-12-23 at 22:00 -0800, Khem Raj wrote:
> is this list of failures due to this patch ?
> https://errors.yoctoproject.org/Errors/Build/137579/

Looks likely, yes. Recipes shouldn't be attempting downloads in do_compile :(

Cheers,

Richard
Konrad Weihmann Dec. 24, 2021, 10:36 a.m. UTC | #13
I had a look at the failures and most of them are fairly easy to fix - 
all but the go recipes like influxdb.

I'm not sure how to deal with that, so there aren't that many options here.

Either
- inject all the needed with a fixed revision, but that would prevent 
devtool from catching updates to them properly
- create recipes for all of them and pray that there won't be a circular 
dependency in any of them
- allow go to work with network (the least favorable option IMO)

in the case of influxdb for instance we are talking about

github.com/peterh/liner
golang.org/x/crypto
github.com/influxdata/influxql
github.com/influxdata/flux
github.com/BurntSushi/toml
github.com/influxdata/usage-client
golang.org/x/text
go.uber.org/zap
github.com/klauspost/pgzip
github.com/prometheus/client_golang
github.com/jsternberg/zap-logfmt
github.com/mattn/go-isatty
collectd.org
github.com/bmizerany/pat
github.com/dgrijalva/jwt-go/v4
github.com/gogo/protobuf
github.com/golang/snappy
github.com/tinylib/msgp
github.com/opentracing/opentracing-go
golang.org/x/sync
github.com/apache/arrow/go/arrow
github.com/pkg/errors
google.golang.org/grpc
github.com/kraj/xxhash
github.com/influxdata/roaring
github.com/xlab/treeprint
golang.org/x/time
golang.org/x/sys
github.com/jwilder/encoding
github.com/dgryski/go-bitstream

where especially the golang.org/x modules are known to have circular 
dependencies like x/a relies on x/b, while x/b requires x/c and x/c 
needs x/a for building.

Not sure how to tackle this - any thoughts?

On 24.12.21 09:30, Richard Purdie wrote:
> On Thu, 2021-12-23 at 22:00 -0800, Khem Raj wrote:
>> is this list of failures due to this patch ?
>> https://errors.yoctoproject.org/Errors/Build/137579/
> 
> Looks likely, yes. Recipes shouldn't be attempting downloads in do_compile :(
> 
> Cheers,
> 
> Richard
> 
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#159996): https://lists.openembedded.org/g/openembedded-core/message/159996
> Mute This Topic: https://lists.openembedded.org/mt/87909311/3647476
> Group Owner: openembedded-core+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [kweihmann@outlook.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
Stefan Herbrechtsmeier Dec. 25, 2021, 7:25 p.m. UTC | #14
Hi Richard,

Am 24.12.21 um 09:30 schrieb Richard Purdie:
> On Thu, 2021-12-23 at 22:00 -0800, Khem Raj wrote:
>> is this list of failures due to this patch ?
>> https://errors.yoctoproject.org/Errors/Build/137579/
> 
> Looks likely, yes. Recipes shouldn't be attempting downloads in do_compile :(

In this case the go and especially go-mod class is broken by design 
because it downloads all its dependencies after do_fetch.

Regards
   Stefan
Stefan Herbrechtsmeier Dec. 25, 2021, 7:32 p.m. UTC | #15
Hi Konrad,

Am 24.12.21 um 11:36 schrieb Konrad Weihmann:
> I had a look at the failures and most of them are fairly easy to fix - 
> all but the go recipes like influxdb.

The go class doesn't work without network by default if the project 
doesn't provide a vendor folder.

> I'm not sure how to deal with that, so there aren't that many options here.

This is a common problem for all language specific package managers 
(python / pip, Node.js / npm, Rust / Carge, go) and we need a common 
solution.

> Either
> - inject all the needed with a fixed revision, but that would prevent 
> devtool from catching updates to them properly

This is possible if we extend recipetool to add the dependencies to the 
recipe. Rust and npm are using this option and I have a WIP to improve 
the npm solution. But in any case we loose a lot of advantages of OE.

> - create recipes for all of them and pray that there won't be a circular 
> dependency in any of them

The circular dependency is only the least problem and could be fixed by 
splitting the dependency and code into separate recipes. The main 
problem is the huge increase of recipe count and parse time. Furthermore 
we need the possibility to test recipe updates of dependencies. Python 
is using this option.

> - allow go to work with network (the least favorable option IMO)

This is how the current go and especially go-mod class work.

> in the case of influxdb for instance we are talking about
> 
> github.com/peterh/liner
> golang.org/x/crypto
> github.com/influxdata/influxql
> github.com/influxdata/flux
> github.com/BurntSushi/toml
> github.com/influxdata/usage-client
> golang.org/x/text
> go.uber.org/zap
> github.com/klauspost/pgzip
> github.com/prometheus/client_golang
> github.com/jsternberg/zap-logfmt
> github.com/mattn/go-isatty
> collectd.org
> github.com/bmizerany/pat
> github.com/dgrijalva/jwt-go/v4
> github.com/gogo/protobuf
> github.com/golang/snappy
> github.com/tinylib/msgp
> github.com/opentracing/opentracing-go
> golang.org/x/sync
> github.com/apache/arrow/go/arrow
> github.com/pkg/errors
> google.golang.org/grpc
> github.com/kraj/xxhash
> github.com/influxdata/roaring
> github.com/xlab/treeprint
> golang.org/x/time
> golang.org/x/sys
> github.com/jwilder/encoding
> github.com/dgryski/go-bitstream
> 
> where especially the golang.org/x modules are known to have circular 
> dependencies like x/a relies on x/b, while x/b requires x/c and x/c 
> needs x/a for building.

Isn't the circular dependencies problem mainly a problem of the native 
packages? In many cases the dependency only exists at compile time of 
the main project and not at compile time of the dependency itself. In 
case of a native package all dependency sysroots need to be finished 
before the recipe sysroot itself.

Regards
   Stefan
Khem Raj Dec. 25, 2021, 7:41 p.m. UTC | #16
On Fri, Dec 24, 2021 at 2:36 AM Konrad Weihmann <kweihmann@outlook.com> wrote:
>
> I had a look at the failures and most of them are fairly easy to fix -
> all but the go recipes like influxdb.
>
> I'm not sure how to deal with that, so there aren't that many options here.
>
> Either
> - inject all the needed with a fixed revision, but that would prevent
> devtool from catching updates to them properly
> - create recipes for all of them and pray that there won't be a circular
> dependency in any of them
> - allow go to work with network (the least favorable option IMO)
>
> in the case of influxdb for instance we are talking about
>
> github.com/peterh/liner
> golang.org/x/crypto
> github.com/influxdata/influxql
> github.com/influxdata/flux
> github.com/BurntSushi/toml
> github.com/influxdata/usage-client
> golang.org/x/text
> go.uber.org/zap
> github.com/klauspost/pgzip
> github.com/prometheus/client_golang
> github.com/jsternberg/zap-logfmt
> github.com/mattn/go-isatty
> collectd.org
> github.com/bmizerany/pat
> github.com/dgrijalva/jwt-go/v4
> github.com/gogo/protobuf
> github.com/golang/snappy
> github.com/tinylib/msgp
> github.com/opentracing/opentracing-go
> golang.org/x/sync
> github.com/apache/arrow/go/arrow
> github.com/pkg/errors
> google.golang.org/grpc
> github.com/kraj/xxhash
> github.com/influxdata/roaring
> github.com/xlab/treeprint
> golang.org/x/time
> golang.org/x/sys
> github.com/jwilder/encoding
> github.com/dgryski/go-bitstream
>
> where especially the golang.org/x modules are known to have circular
> dependencies like x/a relies on x/b, while x/b requires x/c and x/c
> needs x/a for building.
>
> Not sure how to tackle this - any thoughts?

right. Unless we can fix these modern language ecosystem use in OE
this is going to be quite worrisome to enforce
unless there is a tool for folks to convert these innumerable
dependencies into DEPENDS or SRC_URIs, perhaps devtool
could be enhanced. Asking folks to add so many entries by hand is a no go.

Perhaps its better to apply this in stages, where it is enabled on
C/C++ and some of other ecosystems where package management
and build systems are not part of language ecosytem.

I don't even have a formulated idea in my mind as to how to solve them
for OE and bitbake method of fetching.

>
> On 24.12.21 09:30, Richard Purdie wrote:
> > On Thu, 2021-12-23 at 22:00 -0800, Khem Raj wrote:
> >> is this list of failures due to this patch ?
> >> https://errors.yoctoproject.org/Errors/Build/137579/
> >
> > Looks likely, yes. Recipes shouldn't be attempting downloads in do_compile :(
> >
> > Cheers,
> >
> > Richard
> >
> >
> >
> > -=-=-=-=-=-=-=-=-=-=-=-
> > Links: You receive all messages sent to this group.
> > View/Reply Online (#159996): https://lists.openembedded.org/g/openembedded-core/message/159996
> > Mute This Topic: https://lists.openembedded.org/mt/87909311/3647476
> > Group Owner: openembedded-core+owner@lists.openembedded.org
> > Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [kweihmann@outlook.com]
> > -=-=-=-=-=-=-=-=-=-=-=-
> >
Alexander Kanavin Dec. 25, 2021, 7:41 p.m. UTC | #17
On Sat, 25 Dec 2021 at 20:32, Stefan Herbrechtsmeier <
stefan@herbrechtsmeier.net> wrote:

> > I'm not sure how to deal with that, so there aren't that many options
> here.
>
> This is a common problem for all language specific package managers
> (python / pip, Node.js / npm, Rust / Carge, go) and we need a common
> solution.
>

I tend to think that the best (and the hardest) option is to improve these
tools so that they're usable inside do_fetch (e.g. fulfil the
caching/reproducibility criteria for a bitbake fetcher), and the needed
changes are acceptable to upstreams.

Alex
Konrad Weihmann Dec. 25, 2021, 8:43 p.m. UTC | #18
What I so far don't really get is why increase in parsing time is such a 
big deal.
I admit when we're talking about npm it's some kind of a drastic 
increase in recipes one would have to maintain, just because some random 
project decides to use a trillion dependencies instead of writing two or 
three lines of code more.

Still I come to think this might be actually beneficial, as it shows how 
broken npm is from a distribution perspective - as it may be that some 
users actually start to access the situation when they are actually 
aware what monstrosity of a dep tree they are inheriting by just a 
single npm module.

So this is mind - and I don't want to sound radical - I would rather 
abandon npm and go support in core than to sacrifice closing the one 
main loophole in core that is preventing true accessibility at the moment.
Which is uncontrolled fetching outside of the fetching task, as this 
invalidates everything you will get in the end in terms of licensing, 
quality reports that haven't been done as part of the build and even CVE 
checking becomes pretty much worthless if one is allowed to inject 
random code on the fly into the build - and in the end everything that 
can't be assessed outside of the build is pretty much non-existing for 
most of the assessments I had to work with.

In the past I showed that npm and go can be made to work with these 
principles, even if they would introduce a different way of working and 
maybe the need for better tooling - *but* they can work with how oe-core 
works at the moment - to the expense of parsing time - and that's pretty 
much it from my point of view.

Still the gains outweigh it by far in my opinion, as it would make all 
of that accessible by common tooling already in place - including true 
reproducibility and builtin quality reports.

BTW I don't think rust and therefore cargo is heavily affected by it, as 
the cargo to bitbake scripting works kind in the way I would imagine for 
npm and go too - so far just npm and go are really really bad to handle 
in this case.

I'm happy to help on such scripting and tooling, but I don't see much 
worth in either accepting the fact that "modern languages" have to 
include some not validatable "magic" (which then would mean allowing 
uncontrolled network access) or working around the fact that a trillion 
dependencies is still a trillion dependencies no matter how you put it ( 
:) )

Regards
Konrad
Konrad Weihmann Dec. 25, 2021, 8:58 p.m. UTC | #19
On 25.12.21 20:32, Stefan Herbrechtsmeier wrote:
> 
> Isn't the circular dependencies problem mainly a problem of the native 
> packages? In many cases the dependency only exists at compile time of 
> the main project and not at compile time of the dependency itself. In 
> case of a native package all dependency sysroots need to be finished 
> before the recipe sysroot itself.

Unfortunately it is not - it's happens that the go.mod already creates 
this circular dependency.

The classic example is that golang.org/x/tools depends on parts of 
golang.org/x/text (and guess what) golang.org/x/text depends solely on 
golang.org/x/tools :facepalm:

For go itself this isn't a big issue, as the compiler just pulls all the 
needed code into the compile workspace and build a binary out of it, 
which can be reused later on.

But for bitbake it's a big issue, as either you pull in

golang.org/x/text + golang.org/x/tools sources into the 
golang.org/x/text workspace, stripping you off the possibility to 
properly watch the revision of golang.org/x/tools (in terms of devtools 
and co) - or you simply end up with a circular dependency.

I scratched my head on this one for weeks, till I found a way to 
actually use the latest greatest of each go module each coming as a 
recipe of its own, while avoiding above mentioned situations -- and it's 
kind of a huge break to how things are done right now.

I'm pretty sure the same will happen in rust sooner or later.

And I personally don't see how to break it as I don't know any way go 
distinguishes between compile and runtime dependency - afaik they all 
need to be present at compile time
Konrad Weihmann Dec. 25, 2021, 9:09 p.m. UTC | #20
One more thing that came to my mind is that the project is missing demo 
recipes/real life examples in core that actually would show that one can 
achieve reproducibility in the sense of the project's definition for 
both npm and go.
So far there is the fetcher code but real life users are living elsewhere.

So one could add consumers to core or push the code to where the "real 
action" is (meta-virt for instance comes to my mind)

On 25.12.21 21:43, Konrad Weihmann wrote:
> What I so far don't really get is why increase in parsing time is such a 
> big deal.
> I admit when we're talking about npm it's some kind of a drastic 
> increase in recipes one would have to maintain, just because some random 
> project decides to use a trillion dependencies instead of writing two or 
> three lines of code more.
> 
> Still I come to think this might be actually beneficial, as it shows how 
> broken npm is from a distribution perspective - as it may be that some 
> users actually start to access the situation when they are actually 
> aware what monstrosity of a dep tree they are inheriting by just a 
> single npm module.
> 
> So this is mind - and I don't want to sound radical - I would rather 
> abandon npm and go support in core than to sacrifice closing the one 
> main loophole in core that is preventing true accessibility at the moment.
> Which is uncontrolled fetching outside of the fetching task, as this 
> invalidates everything you will get in the end in terms of licensing, 
> quality reports that haven't been done as part of the build and even CVE 
> checking becomes pretty much worthless if one is allowed to inject 
> random code on the fly into the build - and in the end everything that 
> can't be assessed outside of the build is pretty much non-existing for 
> most of the assessments I had to work with.
> 
> In the past I showed that npm and go can be made to work with these 
> principles, even if they would introduce a different way of working and 
> maybe the need for better tooling - *but* they can work with how oe-core 
> works at the moment - to the expense of parsing time - and that's pretty 
> much it from my point of view.
> 
> Still the gains outweigh it by far in my opinion, as it would make all 
> of that accessible by common tooling already in place - including true 
> reproducibility and builtin quality reports.
> 
> BTW I don't think rust and therefore cargo is heavily affected by it, as 
> the cargo to bitbake scripting works kind in the way I would imagine for 
> npm and go too - so far just npm and go are really really bad to handle 
> in this case.
> 
> I'm happy to help on such scripting and tooling, but I don't see much 
> worth in either accepting the fact that "modern languages" have to 
> include some not validatable "magic" (which then would mean allowing 
> uncontrolled network access) or working around the fact that a trillion 
> dependencies is still a trillion dependencies no matter how you put it ( 
> :) )
> 
> Regards
> Konrad
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#160008): https://lists.openembedded.org/g/openembedded-core/message/160008
> Mute This Topic: https://lists.openembedded.org/mt/87909311/3647476
> Group Owner: openembedded-core+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [kweihmann@outlook.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
Stefan Herbrechtsmeier Dec. 27, 2021, 12:54 p.m. UTC | #21
Hi Konrad,

Am 25.12.21 um 21:43 schrieb Konrad Weihmann:
> What I so far don't really get is why increase in parsing time is such a 
> big deal.
> I admit when we're talking about npm it's some kind of a drastic 
> increase in recipes one would have to maintain, just because some random 
> project decides to use a trillion dependencies instead of writing two or 
> three lines of code more.
> 
> Still I come to think this might be actually beneficial, as it shows how 
> broken npm is from a distribution perspective - as it may be that some 
> users actually start to access the situation when they are actually 
> aware what monstrosity of a dep tree they are inheriting by just a 
> single npm module.
> 
> So this is mind - and I don't want to sound radical - I would rather 
> abandon npm and go support in core than to sacrifice closing the one 
> main loophole in core that is preventing true accessibility at the moment.
> Which is uncontrolled fetching outside of the fetching task, as this 
> invalidates everything you will get in the end in terms of licensing, 
> quality reports that haven't been done as part of the build and even CVE 
> checking becomes pretty much worthless if one is allowed to inject 
> random code on the fly into the build - and in the end everything that 
> can't be assessed outside of the build is pretty much non-existing for 
> most of the assessments I had to work with.
> 
> In the past I showed that npm and go can be made to work with these 
> principles, even if they would introduce a different way of working and 
> maybe the need for better tooling - *but* they can work with how oe-core 
> works at the moment - to the expense of parsing time - and that's pretty 
> much it from my point of view.
> 
> Still the gains outweigh it by far in my opinion, as it would make all 
> of that accessible by common tooling already in place - including true 
> reproducibility and builtin quality reports.
> 
> BTW I don't think rust and therefore cargo is heavily affected by it, as 
> the cargo to bitbake scripting works kind in the way I would imagine for 
> npm and go too - so far just npm and go are really really bad to handle 
> in this case.
> 
> I'm happy to help on such scripting and tooling, but I don't see much 
> worth in either accepting the fact that "modern languages" have to 
> include some not validatable "magic" (which then would mean allowing 
> uncontrolled network access) or working around the fact that a trillion 
> dependencies is still a trillion dependencies no matter how you put it ( 
> :) )

I fully agree with you and would be happy to add a 'recursive' option to 
devtool / recipetool to create multiple recipes on the fly instead of 
one big recipe with more than 100 different projects inclusive duplicate 
and different compatible versions.
Stefan Herbrechtsmeier Dec. 27, 2021, 1:13 p.m. UTC | #22
Am 25.12.21 um 21:58 schrieb Konrad Weihmann:
> 
> 
> On 25.12.21 20:32, Stefan Herbrechtsmeier wrote:
>>
>> Isn't the circular dependencies problem mainly a problem of the native 
>> packages? In many cases the dependency only exists at compile time of 
>> the main project and not at compile time of the dependency itself. In 
>> case of a native package all dependency sysroots need to be finished 
>> before the recipe sysroot itself.
> 
> Unfortunately it is not - it's happens that the go.mod already creates 
> this circular dependency.
> 
> The classic example is that golang.org/x/tools depends on parts of 
> golang.org/x/text (and guess what) golang.org/x/text depends solely on 
> golang.org/x/tools :facepalm:

Does this circular dependency exists in the executables (main) only?

> For go itself this isn't a big issue, as the compiler just pulls all the 
> needed code into the compile workspace and build a binary out of it, 
> which can be reused later on.
> 
> But for bitbake it's a big issue, as either you pull in
> 
> golang.org/x/text + golang.org/x/tools sources into the 
> golang.org/x/text workspace, stripping you off the possibility to 
> properly watch the revision of golang.org/x/tools (in terms of devtools 
> and co) - or you simply end up with a circular dependency.
> 
> I scratched my head on this one for weeks, till I found a way to 
> actually use the latest greatest of each go module each coming as a 
> recipe of its own, while avoiding above mentioned situations -- and it's 
> kind of a huge break to how things are done right now.
> 
> I'm pretty sure the same will happen in rust sooner or later.
> 
> And I personally don't see how to break it as I don't know any way go 
> distinguishes between compile and runtime dependency - afaik they all 
> need to be present at compile time

Is it possible to create a recipe for the source and a recipe for the 
binaries which depends on the source recipe? In this case the DEPENDS is 
always the source package and the RDEPENDS the binary package. A 
BBCLASSEXTEND could be used to create the source recipe automatically.
Konrad Weihmann Dec. 27, 2021, 1:22 p.m. UTC | #23
On 27.12.21 13:54, Stefan Herbrechtsmeier wrote:
> Hi Konrad,
> 
> Am 25.12.21 um 21:43 schrieb Konrad Weihmann:
>> What I so far don't really get is why increase in parsing time is such 
>> a big deal.
>> I admit when we're talking about npm it's some kind of a drastic 
>> increase in recipes one would have to maintain, just because some 
>> random project decides to use a trillion dependencies instead of 
>> writing two or three lines of code more.
>>
>> Still I come to think this might be actually beneficial, as it shows 
>> how broken npm is from a distribution perspective - as it may be that 
>> some users actually start to access the situation when they are 
>> actually aware what monstrosity of a dep tree they are inheriting by 
>> just a single npm module.
>>
>> So this is mind - and I don't want to sound radical - I would rather 
>> abandon npm and go support in core than to sacrifice closing the one 
>> main loophole in core that is preventing true accessibility at the 
>> moment.
>> Which is uncontrolled fetching outside of the fetching task, as this 
>> invalidates everything you will get in the end in terms of licensing, 
>> quality reports that haven't been done as part of the build and even 
>> CVE checking becomes pretty much worthless if one is allowed to inject 
>> random code on the fly into the build - and in the end everything that 
>> can't be assessed outside of the build is pretty much non-existing for 
>> most of the assessments I had to work with.
>>
>> In the past I showed that npm and go can be made to work with these 
>> principles, even if they would introduce a different way of working 
>> and maybe the need for better tooling - *but* they can work with how 
>> oe-core works at the moment - to the expense of parsing time - and 
>> that's pretty much it from my point of view.
>>
>> Still the gains outweigh it by far in my opinion, as it would make all 
>> of that accessible by common tooling already in place - including true 
>> reproducibility and builtin quality reports.
>>
>> BTW I don't think rust and therefore cargo is heavily affected by it, 
>> as the cargo to bitbake scripting works kind in the way I would 
>> imagine for npm and go too - so far just npm and go are really really 
>> bad to handle in this case.
>>
>> I'm happy to help on such scripting and tooling, but I don't see much 
>> worth in either accepting the fact that "modern languages" have to 
>> include some not validatable "magic" (which then would mean allowing 
>> uncontrolled network access) or working around the fact that a 
>> trillion dependencies is still a trillion dependencies no matter how 
>> you put it ( :) )
> 
> I fully agree with you and would be happy to add a 'recursive' option to 
> devtool / recipetool to create multiple recipes on the fly instead of 
> one big recipe with more than 100 different projects inclusive duplicate 
> and different compatible versions.

That, yes - and a split between sources and recipes, I guess, is needed.
Like creation of a sharable inc-file for definition of sources (SRC_URI 
and dependencies) in addition to the actual recipe itself.
(One could also think about versioned inc-files here)
Nevertheless this is needed to untangle go (luckily npm is much more 
straight forward in this sense).

Like one has a go module "a" that consists of an inc-file, which defines 
the needed sources for "a" and the needed dependencies in terms of other 
go modules.
And then you have the recipe "a" which just includes this inc file and 
go.bbclass (plus the usual recipe specific meta data).
Now I could build recipe a or I could build any other recipe that 
depends on this go module with a version I could define (or tune).

I know that this isn't something that is not very welcome, as projects 
tend to define their versioned dependencies with lock-files (and every 
language/eco system in their very own way) - still I think the "freedom 
of choice" for a version is something that is a superb feature when 
using yocto - so I personally would rather ignore the maybe 
outdated/wrong information of a project as to not have it (this choice) 
at all.

So if one could make that happen with devtool (maybe using templates or 
so) I think the project would be prepared to fully support these 
"modern" languages without giving up on some of the principles like 
reproducibility

-- to get back to the responses from you other mail --

 > Is it possible to create a recipe for the source and a recipe for the 
 > binaries which depends on the source recipe? In this case the DEPENDS 
 > is always the source package and the RDEPENDS the binary package.
 > A BBCLASSEXTEND could be used to create the source recipe automatically.

That could be worth a shot, as I think this could actually work - 
basically the same idea as above but with recipes instead of inc-files.
Stefan Herbrechtsmeier Dec. 27, 2021, 1:38 p.m. UTC | #24
Hi Alex,

Am 25.12.21 um 20:41 schrieb Alexander Kanavin:
> On Sat, 25 Dec 2021 at 20:32, Stefan Herbrechtsmeier 
> <stefan@herbrechtsmeier.net <mailto:stefan@herbrechtsmeier.net>> wrote:
> 
>      > I'm not sure how to deal with that, so there aren't that many
>     options here.
> 
>     This is a common problem for all language specific package managers
>     (python / pip, Node.js / npm, Rust / Carge, go) and we need a common
>     solution.
> 
> 
> I tend to think that the best (and the hardest) option is to improve 
> these tools so that they're usable inside do_fetch (e.g. fulfil the 
> caching/reproducibility criteria for a bitbake fetcher), and the needed 
> changes are acceptable to upstreams.

Is the fetcher really the problem? In all cases the input and output of 
the package manager fetch task is well defined. In the npm case the 
bitbake npmsw fetcher and my recipetool approach translate this 
configuration into bitbake fetch and unpack commands.

The real problem is the different philosophy between OE and the package 
manager. The package manager doesn't care about duplicate versions, 
maintenance versions, version updates of indirect dependencies, license 
compliance, CVE checks or dead code (examples, documentations, test, 
...) and if they care every package manager have its own solution.

Why C/C++ and Python doesn't fetch all its dependencies inside a single 
recipe and why do we try to replace embedded dependencies? I think we 
have good reasons for it and we shouldn't discard it for other languages.

Independent of the language an update of a dependency need a test inside 
a user and with Node-RED as an example I show that this is possible for 
npm modules.

Regards
   Stefan
Alexander Kanavin Dec. 27, 2021, 2:05 p.m. UTC | #25
On Mon, 27 Dec 2021 at 14:38, Stefan Herbrechtsmeier <
stefan@herbrechtsmeier.net> wrote:

> > I tend to think that the best (and the hardest) option is to improve
> > these tools so that they're usable inside do_fetch (e.g. fulfil the
> > caching/reproducibility criteria for a bitbake fetcher), and the needed
> > changes are acceptable to upstreams.
>
> Is the fetcher really the problem? In all cases the input and output of
> the package manager fetch task is well defined. In the npm case the
> bitbake npmsw fetcher and my recipetool approach translate this
> configuration into bitbake fetch and unpack commands.
>
> The real problem is the different philosophy between OE and the package
> manager. The package manager doesn't care about duplicate versions,
> maintenance versions, version updates of indirect dependencies, license
> compliance, CVE checks or dead code (examples, documentations, test,
> ...) and if they care every package manager have its own solution.
>

I feel that ship has sailed. The rest of the world has already chosen the
self-contained bundling via custom language-specific package management.
Keeping up with that via custom reimplementations of those tools will be a
never ending maintenance burden; I'd rather directly use the tools that
everyone else does. It's all open source and can be fixed.


> Why C/C++ and Python doesn't fetch all its dependencies inside a single
> recipe and why do we try to replace embedded dependencies? I think we
> have good reasons for it and we shouldn't discard it for other languages.
>

C/C++ is kind of moving that way too. Meson already has support for it; so
far it's optional, but it does exist. The reason you can still do manual
dependency management there is that those environments are from a different
era, and carry a lot of inertia.

Alex
Aaltonen Eero Dec. 27, 2021, 2:54 p.m. UTC | #26
On Mon, 2021-12-27 at 14:38 +0100, Stefan Herbrechtsmeier via
lists.openembedded.org wrote:
> Hi Alex,
> 
> Am 25.12.21 um 20:41 schrieb Alexander Kanavin:
> > On Sat, 25 Dec 2021 at 20:32, Stefan Herbrechtsmeier 
> > <stefan@herbrechtsmeier.net <mailto:stefan@herbrechtsmeier.net>>
> > wrote:
> > 
> >      > I'm not sure how to deal with that, so there aren't that
> > many options here.
> > 
> >     This is a common problem for all language specific package
> > managers
> >     (python / pip, Node.js / npm, Rust / Carge, go) and we need a
> > common solution.
> > 
> > 
> > I tend to think that the best (and the hardest) option is to
> > improve these tools so that they're usable inside do_fetch (e.g.
> > fulfil the caching/reproducibility criteria for a bitbake fetcher),
> > and the needed changes are acceptable to upstreams.
> 
> The real problem is the different philosophy between OE and the
> package manager. The package manager doesn't care about duplicate
> versions, maintenance versions, version updates of indirect
> dependencies, license compliance, CVE checks or dead code (examples,
> documentations, test, 
> ...) and if they care every package manager have its own solution.

The Java OSS ecosystem has had fairly well defined build time and
runtime dependency metadata commonly available for many years now. The
more widely used build tools are capable of using this metadata to
calculate a solution Set(lib, version) that satisfies the version
requirements of the dependency graph - or report that no solution is
available. Cryptographic hashes are however by default provided by the
artifact server(s).

For Python distribution packages, the 2020 pip resolver is advertised
to be better at calculating solutions to required package
dependencies. Cryptographic hashes are specified for pip, use is
optional. The PyPA Core metadata includes some potentially useful
fields, such as the license and what the project requires.

The language specific package managers could potentially be used to
export dependency metadata, and also potentially to check "with (foo
i.j.k) and (bar x.y.z), is there a set of compatible package versions
available for baz and it's dependencies?"


Regards,
Eero Aaltonen
Stefan Herbrechtsmeier Dec. 27, 2021, 3:47 p.m. UTC | #27
Am 27.12.21 um 15:54 schrieb Eero Aaltonen:
> On Mon, 2021-12-27 at 14:38 +0100, Stefan Herbrechtsmeier via
> lists.openembedded.org wrote:
>> Hi Alex,
>>
>> Am 25.12.21 um 20:41 schrieb Alexander Kanavin:
>>> On Sat, 25 Dec 2021 at 20:32, Stefan Herbrechtsmeier
>>> <stefan@herbrechtsmeier.net <mailto:stefan@herbrechtsmeier.net>>
>>> wrote:
>>>
>>>       > I'm not sure how to deal with that, so there aren't that
>>> many options here.
>>>
>>>      This is a common problem for all language specific package
>>> managers
>>>      (python / pip, Node.js / npm, Rust / Carge, go) and we need a
>>> common solution.
>>>
>>>
>>> I tend to think that the best (and the hardest) option is to
>>> improve these tools so that they're usable inside do_fetch (e.g.
>>> fulfil the caching/reproducibility criteria for a bitbake fetcher),
>>> and the needed changes are acceptable to upstreams.
>>
>> The real problem is the different philosophy between OE and the
>> package manager. The package manager doesn't care about duplicate
>> versions, maintenance versions, version updates of indirect
>> dependencies, license compliance, CVE checks or dead code (examples,
>> documentations, test,
>> ...) and if they care every package manager have its own solution.
> 
> The Java OSS ecosystem has had fairly well defined build time and
> runtime dependency metadata commonly available for many years now. The
> more widely used build tools are capable of using this metadata to
> calculate a solution Set(lib, version) that satisfies the version
> requirements of the dependency graph - or report that no solution is
> available. Cryptographic hashes are however by default provided by the
> artifact server(s).
> 
> For Python distribution packages, the 2020 pip resolver is advertised
> to be better at calculating solutions to required package
> dependencies. Cryptographic hashes are specified for pip, use is
> optional. The PyPA Core metadata includes some potentially useful
> fields, such as the license and what the project requires.
> 
> The language specific package managers could potentially be used to
> export dependency metadata, and also potentially to check "with (foo
> i.j.k) and (bar x.y.z), is there a set of compatible package versions
> available for baz and it's dependencies?"

This is possible and already done in recipetool.

I see two possible solution to support language specific package managers:

a) Use the package manager manifest direct (go.sum, npm-shrinkwrap.json, 
...) to fetch the dependencies.

b) Create recipes from the package manager manifest.

Either we give away the control over the dependencies and use the 
language specific way of manage dependencies or we have to extract the 
information as suggestion.

Everything between means that we lose the benefits from the language 
specific package manager and OE together.

Regard
   Stefan

Patch

diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
index b709777f243..e4c6c983b59 100644
--- a/meta/classes/base.bbclass
+++ b/meta/classes/base.bbclass
@@ -214,6 +214,7 @@  python create_source_date_epoch_stamp() {
     oe.reproducible.epochfile_write(source_date_epoch, d.getVar('SDE_FILE'), d)
 }
 do_unpack[postfuncs] += "create_source_date_epoch_stamp"
+do_unpack[nonetwork] = "1"
 
 def get_source_date_epoch_value(d):
     return oe.reproducible.epochfile_read(d.getVar('SDE_FILE'), d)
@@ -358,6 +359,7 @@  base_do_configure() {
 		echo ${BB_TASKHASH} > ${CONFIGURESTAMPFILE}
 	fi
 }
+do_configure[nonetwork] = "1"
 
 addtask compile after do_configure
 do_compile[dirs] = "${B}"
@@ -368,11 +370,13 @@  base_do_compile() {
 		bbnote "nothing to compile"
 	fi
 }
+do_compile[nonetwork] = "1"
 
 addtask install after do_compile
 do_install[dirs] = "${B}"
 # Remove and re-create ${D} so that is it guaranteed to be empty
 do_install[cleandirs] = "${D}"
+do_install[nonetwork] = "1"
 
 base_do_install() {
 	:
diff --git a/meta/classes/patch.bbclass b/meta/classes/patch.bbclass
index 8de70254919..57aaf7c31d1 100644
--- a/meta/classes/patch.bbclass
+++ b/meta/classes/patch.bbclass
@@ -164,6 +164,7 @@  patch_do_patch[vardepsexclude] = "PATCHRESOLVE"
 
 addtask patch after do_unpack
 do_patch[dirs] = "${WORKDIR}"
+do_patch[nonetwork] = "1"
 do_patch[depends] = "${PATCHDEPENDENCY}"
 
 EXPORT_FUNCTIONS do_patch