diff mbox series

Underscore vs Dash in bbclass filenames

Message ID f8f81126-3fec-4dbd-8088-64d615879167@bigsur.com
State New
Headers show
Series Underscore vs Dash in bbclass filenames | expand

Commit Message

Saul Wold Feb. 7, 2024, 10:21 p.m. UTC
I was looking into Bug 14235 [0] and had initial patch based on Randy's 
original recommendation of using dash (-) which was a basic 
python/tinfoil script to check existing bbclasses in the BBPATH.

It was pointed out that the bbclass name is prepended to exported 
function and that would cause problems for SHELL functions as dash is 
not allowed in the shell function name.

On further investigation this of course becomes a more nuanced problem, 
so I considered an alternative approach for setting up a warning in 
bitbake itself.  There is currently an error check in bitbake's ast.py 
ExportFuncsNode() code. The code below would warnonce() for other 
bbclasses that have a dash and are not in a NO_WARN_BBCLASSES list.

At some point in the future this warning can be changed to an error.
I also noticed that the Regular Expressions defined in ast.py could be 
tightened up slightly, there are a couple that use \s (general string) 
vs \w which limits some special symbols, in order to limit dashes from 
appearing in function names in bbclasses.

This is not for the current Scarthgap release.

          if not fn in __inherit_cache:


Thoughts from the bitbake team on this approach?


[0] https://bugzilla.yoctoproject.org/show_bug.cgi?id=14235

Sau!

Comments

Enrico Scholz Feb. 8, 2024, 3:39 p.m. UTC | #1
"Saul Wold" <sgw@bigsur.com> writes:

> I was looking into Bug 14235 [0] and had initial patch based on
> Randy's original recommendation of using dash (-) which was a basic
> python/tinfoil script to check existing bbclasses in the BBPATH.

why not make it like rust and replace '-' by '_' when generating symbols?


Enrico
Peter Kjellerstedt Feb. 8, 2024, 9:09 p.m. UTC | #2
> -----Original Message-----
> From: bitbake-devel@lists.openembedded.org <bitbake-
> devel@lists.openembedded.org> On Behalf Of Enrico Scholz via
> lists.openembedded.org
> Sent: den 8 februari 2024 16:39
> To: Saul Wold <sgw@bigsur.com>
> Cc: bitbake-devel@lists.openembedded.org;
> richard.purdie@linuxfoundation.org; randy.macleod@windriver.com
> Subject: Re: [bitbake-devel] Underscore vs Dash in bbclass filenames
> 
> "Saul Wold" <sgw@bigsur.com> writes:
> 
> > I was looking into Bug 14235 [0] and had initial patch based on
> > Randy's original recommendation of using dash (-) which was a basic
> > python/tinfoil script to check existing bbclasses in the BBPATH.
> 
> why not make it like rust and replace '-' by '_' when generating symbols?
> 
> Enrico

Taken alone that would open up for problems in case there is both a 
foo_bar.bbclass and a foo-bar.bbclass (which is kind of ridiculous, but 
it is at least a theoretical problem). However, if you combine it with 
the rule that classes should only use dashes in their names, then it 
should work.

Actually, if we want bbclass names to only use dashes, what I would suggest 
(and this may sound odd to some given the recent discussion on breaking 
changes that I have been involved in on the openembedded-architecture list) 
would be to change the bitbake syntax to only allow alphanumeric characters 
and dashes in bbclass names. This may sound like a huge breaking change, but 
it actually isn't. As long as Enrico's suggestion is implemented, it is then 
just to change all class names in OE Core, meta-openembedded, etc. Yes, this 
will break other layers, but changing them is trivial. And if one has layers 
that need to work both with nanbield and master for a transition period (like 
we would have), it is easy to just add some oneliner wrapper classes in one 
of one's own layers for the new names that inherit the old classes.

At least I find that a lot more appealing than adding some check script with 
a lot of hardcoded exceptions for the currently existing classes.

I realize it is probably too late to do something like this for Scarthgap, 
but I think it should be considered for Scarthgap's successor (what ever it 
will be called).

//Peter
Peter Kjellerstedt Feb. 8, 2024, 9:16 p.m. UTC | #3
> -----Original Message-----
> From: bitbake-devel@lists.openembedded.org <bitbake-
> devel@lists.openembedded.org> On Behalf Of Peter Kjellerstedt
> Sent: den 8 februari 2024 22:09
> To: enrico.scholz@sigma-chemnitz.de; Saul Wold <sgw@bigsur.com>
> Cc: bitbake-devel@lists.openembedded.org;
> richard.purdie@linuxfoundation.org; randy.macleod@windriver.com
> Subject: Re: [bitbake-devel] Underscore vs Dash in bbclass filenames
> 
> > -----Original Message-----
> > From: bitbake-devel@lists.openembedded.org <bitbake-
> > devel@lists.openembedded.org> On Behalf Of Enrico Scholz via
> > lists.openembedded.org
> > Sent: den 8 februari 2024 16:39
> > To: Saul Wold <sgw@bigsur.com>
> > Cc: bitbake-devel@lists.openembedded.org;
> > richard.purdie@linuxfoundation.org; randy.macleod@windriver.com
> > Subject: Re: [bitbake-devel] Underscore vs Dash in bbclass filenames
> >
> > "Saul Wold" <sgw@bigsur.com> writes:
> >
> > > I was looking into Bug 14235 [0] and had initial patch based on
> > > Randy's original recommendation of using dash (-) which was a basic
> > > python/tinfoil script to check existing bbclasses in the BBPATH.
> >
> > why not make it like rust and replace '-' by '_' when generating
> > symbols?
> >
> > Enrico
> 
> Taken alone that would open up for problems in case there is both a
> foo_bar.bbclass and a foo-bar.bbclass (which is kind of ridiculous, but
> it is at least a theoretical problem). However, if you combine it with
> the rule that classes should only use dashes in their names, then it
> should work.
> 
> Actually, if we want bbclass names to only use dashes, what I would
> suggest (and this may sound odd to some given the recent discussion on 
> breaking changes that I have been involved in on the 
> openembedded-architecture list) would be to change the bitbake syntax 
> to only allow alphanumeric characters and dashes in bbclass names. This 
> may sound like a huge breaking change, but it actually isn't. As long 
> as Enrico's suggestion is implemented, it is then just to change all 
> class names in OE Core, meta-openembedded, etc. Yes, this will break 
> other layers, but changing them is trivial. And if one has layers that 
> need to work both with nanbield and master for a transition period (like
> we would have), it is easy to just add some oneliner wrapper classes in
> one of one's own layers for the new names that inherit the old classes.
> 
> At least I find that a lot more appealing than adding some check script
> with a lot of hardcoded exceptions for the currently existing classes.
> 
> I realize it is probably too late to do something like this for Scarthgap,
> but I think it should be considered for Scarthgap's successor (what ever
> it will be called).
> 
> //Peter

I forgot to say that regardless of whether the syntax is changed or not, 
I would really like to see Enrico's suggestion implemented for Scarthgap.
That would make it a lot easier to transition to the new naming of 
bbclasses even if the new naming policy isn't a hard rule in Scarthgap.

//Peter
Sam Liddicott Feb. 8, 2024, 9:21 p.m. UTC | #4
Whatever you do I wish that there is some simple text-only non-python
ini-type declaration file to declare these settings, rules & capabilities.

My meta-build-tools have to parse the python or look for specific git
commits to decide whether or not to use : or _ as the override separator in
the generated conf files (e.g. local.conf), and more of this sort of thing
is going to be painful for the next decade. As an example of the
worst case, tools can't tell if the "current distro" is a sanity-supported
distro without first running bitbake.

I don't think I can even tell what the release code-name is without looking
at git revisions in the oe-embedded layer.

Sam

On Thu, 8 Feb 2024 at 21:16, Peter Kjellerstedt <peter.kjellerstedt@axis.com>
wrote:

> > -----Original Message-----
> > From: bitbake-devel@lists.openembedded.org <bitbake-
> > devel@lists.openembedded.org> On Behalf Of Peter Kjellerstedt
> > Sent: den 8 februari 2024 22:09
> > To: enrico.scholz@sigma-chemnitz.de; Saul Wold <sgw@bigsur.com>
> > Cc: bitbake-devel@lists.openembedded.org;
> > richard.purdie@linuxfoundation.org; randy.macleod@windriver.com
> > Subject: Re: [bitbake-devel] Underscore vs Dash in bbclass filenames
> >
> > > -----Original Message-----
> > > From: bitbake-devel@lists.openembedded.org <bitbake-
> > > devel@lists.openembedded.org> On Behalf Of Enrico Scholz via
> > > lists.openembedded.org
> > > Sent: den 8 februari 2024 16:39
> > > To: Saul Wold <sgw@bigsur.com>
> > > Cc: bitbake-devel@lists.openembedded.org;
> > > richard.purdie@linuxfoundation.org; randy.macleod@windriver.com
> > > Subject: Re: [bitbake-devel] Underscore vs Dash in bbclass filenames
> > >
> > > "Saul Wold" <sgw@bigsur.com> writes:
> > >
> > > > I was looking into Bug 14235 [0] and had initial patch based on
> > > > Randy's original recommendation of using dash (-) which was a basic
> > > > python/tinfoil script to check existing bbclasses in the BBPATH.
> > >
> > > why not make it like rust and replace '-' by '_' when generating
> > > symbols?
> > >
> > > Enrico
> >
> > Taken alone that would open up for problems in case there is both a
> > foo_bar.bbclass and a foo-bar.bbclass (which is kind of ridiculous, but
> > it is at least a theoretical problem). However, if you combine it with
> > the rule that classes should only use dashes in their names, then it
> > should work.
> >
> > Actually, if we want bbclass names to only use dashes, what I would
> > suggest (and this may sound odd to some given the recent discussion on
> > breaking changes that I have been involved in on the
> > openembedded-architecture list) would be to change the bitbake syntax
> > to only allow alphanumeric characters and dashes in bbclass names. This
> > may sound like a huge breaking change, but it actually isn't. As long
> > as Enrico's suggestion is implemented, it is then just to change all
> > class names in OE Core, meta-openembedded, etc. Yes, this will break
> > other layers, but changing them is trivial. And if one has layers that
> > need to work both with nanbield and master for a transition period (like
> > we would have), it is easy to just add some oneliner wrapper classes in
> > one of one's own layers for the new names that inherit the old classes.
> >
> > At least I find that a lot more appealing than adding some check script
> > with a lot of hardcoded exceptions for the currently existing classes.
> >
> > I realize it is probably too late to do something like this for
> Scarthgap,
> > but I think it should be considered for Scarthgap's successor (what ever
> > it will be called).
> >
> > //Peter
>
> I forgot to say that regardless of whether the syntax is changed or not,
> I would really like to see Enrico's suggestion implemented for Scarthgap.
> That would make it a lot easier to transition to the new naming of
> bbclasses even if the new naming policy isn't a hard rule in Scarthgap.
>
> //Peter
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#15846):
> https://lists.openembedded.org/g/bitbake-devel/message/15846
> Mute This Topic: https://lists.openembedded.org/mt/104228800/7497305
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [
> sam@liddicott.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
>
Richard Purdie Feb. 9, 2024, 4:54 p.m. UTC | #5
On Thu, 2024-02-08 at 16:39 +0100, Enrico Scholz wrote:
> "Saul Wold" <sgw@bigsur.com> writes:
> 
> > I was looking into Bug 14235 [0] and had initial patch based on
> > Randy's original recommendation of using dash (-) which was a basic
> > python/tinfoil script to check existing bbclasses in the BBPATH.
> 
> why not make it like rust and replace '-' by '_' when generating symbols?

Symbol mangling works ok since the user never needs to reference the
output version.

In our case you could have a class with one format and function names
using both dashes and underscores. I think that could get very
confusing since you could declare it with one name and call it with
another.

There could be a case for assisting a transition with some kind of
translation but once you limit the usage like this, I think it may be
better just to ask users to convert.

This assumes we do need to "fix" this problem, which I remain unsure
about.

Cheers,

Richard
Richard Purdie Feb. 9, 2024, 6:05 p.m. UTC | #6
Hi Saul,

Thanks for looking at this.

On Wed, 2024-02-07 at 14:21 -0800, Saul Wold wrote:
> I was looking into Bug 14235 [0] and had initial patch based on Randy's 
> original recommendation of using dash (-) which was a basic 
> python/tinfoil script to check existing bbclasses in the BBPATH.
> 
> It was pointed out that the bbclass name is prepended to exported 
> function and that would cause problems for SHELL functions as dash is 
> not allowed in the shell function name.
> 
> On further investigation this of course becomes a more nuanced problem, 
> so I considered an alternative approach for setting up a warning in 
> bitbake itself.  There is currently an error check in bitbake's ast.py 
> ExportFuncsNode() code. The code below would warnonce() for other 
> bbclasses that have a dash and are not in a NO_WARN_BBCLASSES list.
> 
> At some point in the future this warning can be changed to an error.
> I also noticed that the Regular Expressions defined in ast.py could be 
> tightened up slightly, there are a couple that use \s (general string) 
> vs \w which limits some special symbols, in order to limit dashes from 
> appearing in function names in bbclasses.

Was this for python or shell functions or both? We should probably
tighten the shell case at the very least.

> 
> This is not for the current Scarthgap release.
> 
> diff --git a/bitbake/lib/bb/parse/parse_py/BBHandler.py 
> b/bitbake/lib/bb/parse/parse_py/BBHandler.py
> index cd1c998f8f8..faf6e9ede63 100644
> --- a/bitbake/lib/bb/parse/parse_py/BBHandler.py
> +++ b/bitbake/lib/bb/parse/parse_py/BBHandler.py
> @@ -116,6 +116,9 @@ def handle(fn, d, include, baseconfig=False):
>       init(d)
> 
>       if ext == ".bbclass":
> +        nowarn_bbclass_list = d.getVar('NOWARN_DASH_BBCLASSES')
> +        if '-' in root and root not in nowarn_bbclass_list:
> +            bb.warnonce("%s contains dash ('-') in name which is not 
> safe" % base_name)
>           __classname__ = root
>           __inherit_cache = d.getVar('__inherit_cache', False) or []
>           if not fn in __inherit_cache:
> 
> 
> Thoughts from the bitbake team on this approach?
> 
> 
> [0] https://bugzilla.yoctoproject.org/show_bug.cgi?id=14235

If we are going to warn about something, it would have to be the dash.
This kind of patch would seem to be roughly the right way to handle it.

My main question is whether we do want to force users through this
transition or just stop new classes? Does the inconsistency really
matter? Do we need to do anything, is there a problem we need to solve
here?

If we do, do we support a set of "remapped" classes, where for example
"inherit create-spdx" would translate into "inherit create_spdx" ? That
does still have potential function name issues but they probably don't
work today with dashes in and any class using those did already
probably switch to an underscore.

Cheers,

Richard
Randy MacLeod Feb. 9, 2024, 9:13 p.m. UTC | #7
On 2024-02-09 1:05 p.m., Richard Purdie wrote:
> Hi Saul,
>
> Thanks for looking at this.
>
> On Wed, 2024-02-07 at 14:21 -0800, Saul Wold wrote:
>> I was looking into Bug 14235 [0] and had initial patch based on Randy's
>> original recommendation of using dash (-) which was a basic
>> python/tinfoil script to check existing bbclasses in the BBPATH.
>>
>> It was pointed out that the bbclass name is prepended to exported
>> function and that would cause problems for SHELL functions as dash is
>> not allowed in the shell function name.
>>
>> On further investigation this of course becomes a more nuanced problem,
>> so I considered an alternative approach for setting up a warning in
>> bitbake itself.  There is currently an error check in bitbake's ast.py
>> ExportFuncsNode() code. The code below would warnonce() for other
>> bbclasses that have a dash and are not in a NO_WARN_BBCLASSES list.
>>
>> At some point in the future this warning can be changed to an error.
>> I also noticed that the Regular Expressions defined in ast.py could be
>> tightened up slightly, there are a couple that use \s (general string)
>> vs \w which limits some special symbols, in order to limit dashes from
>> appearing in function names in bbclasses.
> Was this for python or shell functions or both? We should probably
> tighten the shell case at the very least.
>
>> This is not for the current Scarthgap release.
>>
>> diff --git a/bitbake/lib/bb/parse/parse_py/BBHandler.py
>> b/bitbake/lib/bb/parse/parse_py/BBHandler.py
>> index cd1c998f8f8..faf6e9ede63 100644
>> --- a/bitbake/lib/bb/parse/parse_py/BBHandler.py
>> +++ b/bitbake/lib/bb/parse/parse_py/BBHandler.py
>> @@ -116,6 +116,9 @@ def handle(fn, d, include, baseconfig=False):
>>        init(d)
>>
>>        if ext == ".bbclass":
>> +        nowarn_bbclass_list = d.getVar('NOWARN_DASH_BBCLASSES')
>> +        if '-' in root and root not in nowarn_bbclass_list:
>> +            bb.warnonce("%s contains dash ('-') in name which is not
>> safe" % base_name)
>>            __classname__ = root
>>            __inherit_cache = d.getVar('__inherit_cache', False) or []
>>            if not fn in __inherit_cache:
>>
>>
>> Thoughts from the bitbake team on this approach?
I reported the bug just because I noticed the inconsistency and
thought it could be annoying to people.
>>
>>
>> [0]https://bugzilla.yoctoproject.org/show_bug.cgi?id=14235
> If we are going to warn about something, it would have to be the dash.
> This kind of patch would seem to be roughly the right way to handle it.
Fine with me.
>
> My main question is whether we do want to force users through this
> transition or just stop new classes? Does the inconsistency really
> matter? Do we need to do anything, is there a problem we need to solve
> here?


It was just something  I noticed as being inconsistent.

I'm fine with trying to make only new bbclass files more consistent as a 
starting point
given that having all of one or the other separator is problematic.

Thanks for checking it out Saul!

../Randy


>
> If we do, do we support a set of "remapped" classes, where for example
> "inherit create-spdx" would translate into "inherit create_spdx" ? That
> does still have potential function name issues but they probably don't
> work today with dashes in and any class using those did already
> probably switch to an underscore.






>
> Cheers,
>
> Richard
>
>
Saul Wold Feb. 10, 2024, 12:56 a.m. UTC | #8
On 2/9/24 10:05, Richard Purdie wrote:
> Hi Saul,
> 
> Thanks for looking at this.
> 
No Problem, I grabbed it at Randy's suggestion!

> On Wed, 2024-02-07 at 14:21 -0800, Saul Wold wrote:
>> I was looking into Bug 14235 [0] and had initial patch based on Randy's
>> original recommendation of using dash (-) which was a basic
>> python/tinfoil script to check existing bbclasses in the BBPATH.
>>
>> It was pointed out that the bbclass name is prepended to exported
>> function and that would cause problems for SHELL functions as dash is
>> not allowed in the shell function name.
>>
>> On further investigation this of course becomes a more nuanced problem,
>> so I considered an alternative approach for setting up a warning in
>> bitbake itself.  There is currently an error check in bitbake's ast.py
>> ExportFuncsNode() code. The code below would warnonce() for other
>> bbclasses that have a dash and are not in a NO_WARN_BBCLASSES list.
>>
>> At some point in the future this warning can be changed to an error.
>> I also noticed that the Regular Expressions defined in ast.py could be
>> tightened up slightly, there are a couple that use \s (general string)
>> vs \w which limits some special symbols, in order to limit dashes from
>> appearing in function names in bbclasses.
> 
> Was this for python or shell functions or both? We should probably
> tighten the shell case at the very least.
>
I would consider it both.

Oops, BBHandler.py has the re.compiles(), and I totally confused '\s' is 
whitespace, not string! Not sure what I was thinking, I had a python 
regex page open even!

__export_func_regexp__   = re.compile(r"EXPORT_FUNCTIONS\s+(.+)" )
probably should use \w+ to grab the EXPORT_FUNCTIONS function names.

__addtask_regexp__       = 
re.compile(r"addtask\s+(?P<func>\w+)\s*((before\s*(?P<before>((.*(?=after))|(.*))))|(after\s*(?P<after>((.*(?=before))|(.*)))))*")
Here the \w+ is used to capture the "addtask" function list.

deltask_regexg uses .+ instead of the tighter \w+ also.

If we do rename all dash bbclasses, then the inherit_regexp can change 
from .+ to \w+ also I think, but that's would be future!

>>
>> This is not for the current Scarthgap release.
>>
>> diff --git a/bitbake/lib/bb/parse/parse_py/BBHandler.py
>> b/bitbake/lib/bb/parse/parse_py/BBHandler.py
>> index cd1c998f8f8..faf6e9ede63 100644
>> --- a/bitbake/lib/bb/parse/parse_py/BBHandler.py
>> +++ b/bitbake/lib/bb/parse/parse_py/BBHandler.py
>> @@ -116,6 +116,9 @@ def handle(fn, d, include, baseconfig=False):
>>        init(d)
>>
>>        if ext == ".bbclass":
>> +        nowarn_bbclass_list = d.getVar('NOWARN_DASH_BBCLASSES')
>> +        if '-' in root and root not in nowarn_bbclass_list:
>> +            bb.warnonce("%s contains dash ('-') in name which is not
>> safe" % base_name)
>>            __classname__ = root
>>            __inherit_cache = d.getVar('__inherit_cache', False) or []
>>            if not fn in __inherit_cache:
>>
>>
>> Thoughts from the bitbake team on this approach?
>>
>>
>> [0] https://bugzilla.yoctoproject.org/show_bug.cgi?id=14235
> 
> If we are going to warn about something, it would have to be the dash.
> This kind of patch would seem to be roughly the right way to handle it.
> 
> My main question is whether we do want to force users through this
> transition or just stop new classes? Does the inconsistency really
> matter? Do we need to do anything, is there a problem we need to solve
> here?
> 
If we try to stop new classes, we need to keep a list of "grandfathered" 
classes which is the NOWARN_DASH_BBCLASSES variable in my proposal above.

I can post this with an associated patch to oe-core and 
meta-openembbeded layers, other layers would need to add their own 
NOWARN_DASH_BBCLASSES +=  the extend the list.

I am not sure if this would require a version bump which is why I think 
it needs to wait until after Scarthgap.

> If we do, do we support a set of "remapped" classes, where for example
> "inherit create-spdx" would translate into "inherit create_spdx" ? That
> does still have potential function name issues but they probably don't
> work today with dashes in and any class using those did already
> probably switch to an underscore.
> 
Honestly, I would rather not try to deal with remapping classes or 
function names.  As you point out any bbclass that has a shell function 
is already using either an underscore or single name (probably due to 
the error check you added).

This would just help make future naming more consistent with underscore 
or single name as Randy points.

If there is an opening on Tuesday and I make it, I will bring it.

Sau!

> Cheers,
> 
> Richard
> 
>
Saul Wold Feb. 13, 2024, 6:35 p.m. UTC | #9
Following up on this after the Tech Meeting today.  The discussion 
centered around if any function names (python or shell) contain invalid 
characters such a dash (-) and issue a warning or error.

The current __func_start_regexp__ in BBHandler.py is defined as follows:

__func_start_regexp__    = 
re.compile(r"(((?P<py>python(?=(\s|\()))|(?P<fr>fakeroot(?=\s)))\s*)*(?P<func>[\w\.\-\+\{\}\$:]+)?\s*\(\s*\)\s*{$" 
)

Note the sub-pattern for the function name: (?P<func>[\w\.\-\+\{\}\$:]+)
It includes '.', '-', '+' and '$' which are all invalid for shell and 
python function names.  The \w expression which matches [a-zA-Z0-9_] is 
correct for both shell and python functions.

If an invalid character is introduced in a function name with the a 
modified regular expression that removes the invalid characters the 
following error will occur:

ERROR: Unable to parse 
/home/swold/src/yocto/poky/bitbake/lib/bb/parse/parse_py/ConfHandler.py
Traceback (most recent call last):
   File 
"/home/swold/src/yocto/poky/bitbake/lib/bb/parse/parse_py/ConfHandler.py", 
line 200, in feeder(lineno=7, s='python dash-test-python-function() {', 
fn='/home/swold/src/yocto/poky/meta/classes/underscore_bbclass.bbclass', 
statements=[<bb.parse.ast.MethodNode object at 0x7fe6a2dbc700>, 
<bb.parse.ast.AddTaskNode object at 0x7fe6a2dbc5b0>], baseconfig=False, 
conffile=False):

     >    raise ParseError("unparsed line: '%s'" % s, fn, lineno);

bb.parse.ParseError: ParseError at 
/home/swold/src/yocto/poky/meta/classes/underscore_bbclass.bbclass:7: 
unparsed line: 'python dash-test-python-function() {'

With that limited investigation, I realized that there are some 'class' 
specific functions like a do_write_config:append:class-target() or 
do_compile:class-native() which would fail the RE match() in BBHandler.py.

I think more investigation is needed on how those are parsed and handled 
in bitbake (I admit to not being an expert on bitbake internals).

Sau!


On 2/9/24 16:56, Saul Wold wrote:
> 
> 
> On 2/9/24 10:05, Richard Purdie wrote:
>> Hi Saul,
>>
>> Thanks for looking at this.
>>
> No Problem, I grabbed it at Randy's suggestion!
> 
>> On Wed, 2024-02-07 at 14:21 -0800, Saul Wold wrote:
>>> I was looking into Bug 14235 [0] and had initial patch based on Randy's
>>> original recommendation of using dash (-) which was a basic
>>> python/tinfoil script to check existing bbclasses in the BBPATH.
>>>
>>> It was pointed out that the bbclass name is prepended to exported
>>> function and that would cause problems for SHELL functions as dash is
>>> not allowed in the shell function name.
>>>
>>> On further investigation this of course becomes a more nuanced problem,
>>> so I considered an alternative approach for setting up a warning in
>>> bitbake itself.  There is currently an error check in bitbake's ast.py
>>> ExportFuncsNode() code. The code below would warnonce() for other
>>> bbclasses that have a dash and are not in a NO_WARN_BBCLASSES list.
>>>
>>> At some point in the future this warning can be changed to an error.
>>> I also noticed that the Regular Expressions defined in ast.py could be
>>> tightened up slightly, there are a couple that use \s (general string)
>>> vs \w which limits some special symbols, in order to limit dashes from
>>> appearing in function names in bbclasses.
>>
>> Was this for python or shell functions or both? We should probably
>> tighten the shell case at the very least.
>>
> I would consider it both.
> 
> Oops, BBHandler.py has the re.compiles(), and I totally confused '\s' is 
> whitespace, not string! Not sure what I was thinking, I had a python 
> regex page open even!
> 
> __export_func_regexp__   = re.compile(r"EXPORT_FUNCTIONS\s+(.+)" )
> probably should use \w+ to grab the EXPORT_FUNCTIONS function names.
> 
> __addtask_regexp__       = 
> re.compile(r"addtask\s+(?P<func>\w+)\s*((before\s*(?P<before>((.*(?=after))|(.*))))|(after\s*(?P<after>((.*(?=before))|(.*)))))*")
> Here the \w+ is used to capture the "addtask" function list.
> 
> deltask_regexg uses .+ instead of the tighter \w+ also.
> 
> If we do rename all dash bbclasses, then the inherit_regexp can change 
> from .+ to \w+ also I think, but that's would be future!
> 
>>>
>>> This is not for the current Scarthgap release.
>>>
>>> diff --git a/bitbake/lib/bb/parse/parse_py/BBHandler.py
>>> b/bitbake/lib/bb/parse/parse_py/BBHandler.py
>>> index cd1c998f8f8..faf6e9ede63 100644
>>> --- a/bitbake/lib/bb/parse/parse_py/BBHandler.py
>>> +++ b/bitbake/lib/bb/parse/parse_py/BBHandler.py
>>> @@ -116,6 +116,9 @@ def handle(fn, d, include, baseconfig=False):
>>>        init(d)
>>>
>>>        if ext == ".bbclass":
>>> +        nowarn_bbclass_list = d.getVar('NOWARN_DASH_BBCLASSES')
>>> +        if '-' in root and root not in nowarn_bbclass_list:
>>> +            bb.warnonce("%s contains dash ('-') in name which is not
>>> safe" % base_name)
>>>            __classname__ = root
>>>            __inherit_cache = d.getVar('__inherit_cache', False) or []
>>>            if not fn in __inherit_cache:
>>>
>>>
>>> Thoughts from the bitbake team on this approach?
>>>
>>>
>>> [0] https://bugzilla.yoctoproject.org/show_bug.cgi?id=14235
>>
>> If we are going to warn about something, it would have to be the dash.
>> This kind of patch would seem to be roughly the right way to handle it.
>>
>> My main question is whether we do want to force users through this
>> transition or just stop new classes? Does the inconsistency really
>> matter? Do we need to do anything, is there a problem we need to solve
>> here?
>>
> If we try to stop new classes, we need to keep a list of "grandfathered" 
> classes which is the NOWARN_DASH_BBCLASSES variable in my proposal above.
> 
> I can post this with an associated patch to oe-core and 
> meta-openembbeded layers, other layers would need to add their own 
> NOWARN_DASH_BBCLASSES +=  the extend the list.
> 
> I am not sure if this would require a version bump which is why I think 
> it needs to wait until after Scarthgap.
> 
>> If we do, do we support a set of "remapped" classes, where for example
>> "inherit create-spdx" would translate into "inherit create_spdx" ? That
>> does still have potential function name issues but they probably don't
>> work today with dashes in and any class using those did already
>> probably switch to an underscore.
>>
> Honestly, I would rather not try to deal with remapping classes or 
> function names.  As you point out any bbclass that has a shell function 
> is already using either an underscore or single name (probably due to 
> the error check you added).
> 
> This would just help make future naming more consistent with underscore 
> or single name as Randy points.
> 
> If there is an opening on Tuesday and I make it, I will bring it.
> 
> Sau!
> 
>> Cheers,
>>
>> Richard
>>
>>
Peter Kjellerstedt Feb. 14, 2024, 5:52 a.m. UTC | #10
> -----Original Message-----
> From: bitbake-devel@lists.openembedded.org <bitbake-
> devel@lists.openembedded.org> On Behalf Of Saul Wold
> Sent: den 13 februari 2024 19:35
> To: Richard Purdie <richard.purdie@linuxfoundation.org>; bitbake-
> devel@lists.openembedded.org
> Cc: randy.macleod@windriver.com
> Subject: Re: [bitbake-devel] Underscore vs Dash in bbclass filenames
> 
> Following up on this after the Tech Meeting today.  The discussion
> centered around if any function names (python or shell) contain invalid
> characters such a dash (-) and issue a warning or error.
> 
> The current __func_start_regexp__ in BBHandler.py is defined as follows:
> 
> __func_start_regexp__    = re.compile(r"(((?P<py>python(?=(\s|\()))|(?P<fr>fakeroot(?=\s)))\s*)*(?P<func>[\w\.\-\+\{\}\$:]+)?\s*\(\s*\)\s*{$")
> 
> Note the sub-pattern for the function name: (?P<func>[\w\.\-\+\{\}\$:]+)
> It includes '.', '-', '+' and '$' which are all invalid for shell and
> python function names.  The \w expression which matches [a-zA-Z0-9_] is
> correct for both shell and python functions.
> 
> If an invalid character is introduced in a function name with the a
> modified regular expression that removes the invalid characters the
> following error will occur:
> 
> ERROR: Unable to parse
> /home/swold/src/yocto/poky/bitbake/lib/bb/parse/parse_py/ConfHandler.py
> Traceback (most recent call last):
>    File
> "/home/swold/src/yocto/poky/bitbake/lib/bb/parse/parse_py/ConfHandler.py",
> line 200, in feeder(lineno=7, s='python dash-test-python-function() {',
> fn='/home/swold/src/yocto/poky/meta/classes/underscore_bbclass.bbclass',
> statements=[<bb.parse.ast.MethodNode object at 0x7fe6a2dbc700>,
> <bb.parse.ast.AddTaskNode object at 0x7fe6a2dbc5b0>], baseconfig=False,
> conffile=False):
> 
>      >    raise ParseError("unparsed line: '%s'" % s, fn, lineno);
> 
> bb.parse.ParseError: ParseError at
> /home/swold/src/yocto/poky/meta/classes/underscore_bbclass.bbclass:7:
> unparsed line: 'python dash-test-python-function() {'
> 
> With that limited investigation, I realized that there are some 'class'
> specific functions like a do_write_config:append:class-target() or
> do_compile:class-native() which would fail the RE match() in BBHandler.py.
> 
> I think more investigation is needed on how those are parsed and handled
> in bitbake (I admit to not being an expert on bitbake internals).
> 
> Sau!

Going back in time, the following three commits added support for this:

0ded6c7ba6ba27c1fc0d1c7c2adbeda673ebd15c    allow + and - in function names
8bf780c087df9dca19c9d16739731eca9f6e6bc9    tolerate ${...} in function names
3d7348c9eb7deebecce594f005f0019f9139e51c    bitbake/lib/bb/parse/parse_py/BBHandler.py:
            -Patch by pHilipp Zabel to allow dots
    in the function names. This fixes bug #139. It seems right
    to have dots in the packagename

(the commit messages were not very descriptive back then...)

Not sure what use `+` and  `.` are, but `-` is definitely needed. Likewise 
for ${...}, which, e.g., is used to define package specific functions such 
as pkg_postinst:${PN}().

//Peter

> On 2/9/24 16:56, Saul Wold wrote:
> >
> >
> > On 2/9/24 10:05, Richard Purdie wrote:
> >> Hi Saul,
> >>
> >> Thanks for looking at this.
> >>
> > No Problem, I grabbed it at Randy's suggestion!
> >
> >> On Wed, 2024-02-07 at 14:21 -0800, Saul Wold wrote:
> >>> I was looking into Bug 14235 [0] and had initial patch based on
> Randy's
> >>> original recommendation of using dash (-) which was a basic
> >>> python/tinfoil script to check existing bbclasses in the BBPATH.
> >>>
> >>> It was pointed out that the bbclass name is prepended to exported
> >>> function and that would cause problems for SHELL functions as dash is
> >>> not allowed in the shell function name.
> >>>
> >>> On further investigation this of course becomes a more nuanced
> problem,
> >>> so I considered an alternative approach for setting up a warning in
> >>> bitbake itself.  There is currently an error check in bitbake's ast.py
> >>> ExportFuncsNode() code. The code below would warnonce() for other
> >>> bbclasses that have a dash and are not in a NO_WARN_BBCLASSES list.
> >>>
> >>> At some point in the future this warning can be changed to an error.
> >>> I also noticed that the Regular Expressions defined in ast.py could be
> >>> tightened up slightly, there are a couple that use \s (general string)
> >>> vs \w which limits some special symbols, in order to limit dashes from
> >>> appearing in function names in bbclasses.
> >>
> >> Was this for python or shell functions or both? We should probably
> >> tighten the shell case at the very least.
> >>
> > I would consider it both.
> >
> > Oops, BBHandler.py has the re.compiles(), and I totally confused '\s' is
> > whitespace, not string! Not sure what I was thinking, I had a python
> > regex page open even!
> >
> > __export_func_regexp__   = re.compile(r"EXPORT_FUNCTIONS\s+(.+)" )
> > probably should use \w+ to grab the EXPORT_FUNCTIONS function names.
> >
> > __addtask_regexp__       =
> >
> re.compile(r"addtask\s+(?P<func>\w+)\s*((before\s*(?P<before>((.*(?=after)
> )|(.*))))|(after\s*(?P<after>((.*(?=before))|(.*)))))*")
> > Here the \w+ is used to capture the "addtask" function list.
> >
> > deltask_regexg uses .+ instead of the tighter \w+ also.
> >
> > If we do rename all dash bbclasses, then the inherit_regexp can change
> > from .+ to \w+ also I think, but that's would be future!
> >
> >>>
> >>> This is not for the current Scarthgap release.
> >>>
> >>> diff --git a/bitbake/lib/bb/parse/parse_py/BBHandler.py
> >>> b/bitbake/lib/bb/parse/parse_py/BBHandler.py
> >>> index cd1c998f8f8..faf6e9ede63 100644
> >>> --- a/bitbake/lib/bb/parse/parse_py/BBHandler.py
> >>> +++ b/bitbake/lib/bb/parse/parse_py/BBHandler.py
> >>> @@ -116,6 +116,9 @@ def handle(fn, d, include, baseconfig=False):
> >>>        init(d)
> >>>
> >>>        if ext == ".bbclass":
> >>> +        nowarn_bbclass_list = d.getVar('NOWARN_DASH_BBCLASSES')
> >>> +        if '-' in root and root not in nowarn_bbclass_list:
> >>> +            bb.warnonce("%s contains dash ('-') in name which is not
> >>> safe" % base_name)
> >>>            __classname__ = root
> >>>            __inherit_cache = d.getVar('__inherit_cache', False) or []
> >>>            if not fn in __inherit_cache:
> >>>
> >>>
> >>> Thoughts from the bitbake team on this approach?
> >>>
> >>>
> >>> [0] https://bugzilla.yoctoproject.org/show_bug.cgi?id=14235
> >>
> >> If we are going to warn about something, it would have to be the dash.
> >> This kind of patch would seem to be roughly the right way to handle it.
> >>
> >> My main question is whether we do want to force users through this
> >> transition or just stop new classes? Does the inconsistency really
> >> matter? Do we need to do anything, is there a problem we need to solve
> >> here?
> >>
> > If we try to stop new classes, we need to keep a list of "grandfathered"
> > classes which is the NOWARN_DASH_BBCLASSES variable in my proposal
> above.
> >
> > I can post this with an associated patch to oe-core and
> > meta-openembbeded layers, other layers would need to add their own
> > NOWARN_DASH_BBCLASSES +=  the extend the list.
> >
> > I am not sure if this would require a version bump which is why I think
> > it needs to wait until after Scarthgap.
> >
> >> If we do, do we support a set of "remapped" classes, where for example
> >> "inherit create-spdx" would translate into "inherit create_spdx" ? That
> >> does still have potential function name issues but they probably don't
> >> work today with dashes in and any class using those did already
> >> probably switch to an underscore.
> >>
> > Honestly, I would rather not try to deal with remapping classes or
> > function names.  As you point out any bbclass that has a shell function
> > is already using either an underscore or single name (probably due to
> > the error check you added).
> >
> > This would just help make future naming more consistent with underscore
> > or single name as Randy points.
> >
> > If there is an opening on Tuesday and I make it, I will bring it.
> >
> > Sau!
> >
> >> Cheers,
> >>
> >> Richard
> >>
> >>
diff mbox series

Patch

diff --git a/bitbake/lib/bb/parse/parse_py/BBHandler.py 
b/bitbake/lib/bb/parse/parse_py/BBHandler.py
index cd1c998f8f8..faf6e9ede63 100644
--- a/bitbake/lib/bb/parse/parse_py/BBHandler.py
+++ b/bitbake/lib/bb/parse/parse_py/BBHandler.py
@@ -116,6 +116,9 @@  def handle(fn, d, include, baseconfig=False):
      init(d)

      if ext == ".bbclass":
+        nowarn_bbclass_list = d.getVar('NOWARN_DASH_BBCLASSES')
+        if '-' in root and root not in nowarn_bbclass_list:
+            bb.warnonce("%s contains dash ('-') in name which is not 
safe" % base_name)
          __classname__ = root
          __inherit_cache = d.getVar('__inherit_cache', False) or []