Message ID | 20250129185531.3752682-1-n.merinov@inango-systems.com |
---|---|
State | Accepted, archived |
Commit | 93059aad13a12cd69d86368795c88e5349197d5d |
Headers | show |
Series | parse: Forbid ambiguous assignments to ${.}, ${+}, and ${:} variables | expand |
> -----Original Message----- > From: bitbake-devel@lists.openembedded.org <bitbake-devel@lists.openembedded.org> On Behalf Of Nikolay Merinov via lists.openembedded.org > Sent: den 29 januari 2025 19:56 > To: bitbake-devel@lists.openembedded.org > Cc: Nikolai Merinov <n.merinov@inango-systems.com> > Subject: [bitbake-devel] [PATCH] parse: Forbid ambiguous assignments to ${.}, ${+}, and ${:} variables > > Old code that parse variable names in assignment commands behave differntly for differntly -> differently > variables that ends with special symbol for single-character variable names and > multi-character variable names. For example: > > A+="1" # Change variable ${A}, '+' glued to '=' > A+ = "1" # Change variable ${A+} > > +="1" # Change variable ${+}, the '+' symbol not part of assignment operator > + = "1" # Change variable ${+} > > New code would always assume that '.=', '+=', and ':=' is assignment operator. > As result code like the following would raise parsing error > > +="value" > > While code with extra spaces would work as before > > + = "value" # Change variable ${+} > > This change allow to catch issues in code that generate bitbake configuration > files in a manner like "echo ${VARNAME}+=${VALUE} >> conf/local.conf" > > Signed-off-by: Nikolai Merinov <n.merinov@inango-systems.com> > --- > lib/bb/parse/parse_py/ConfHandler.py | 2 +- > lib/toaster/toastermain/management/commands/buildimport.py | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/lib/bb/parse/parse_py/ConfHandler.py b/lib/bb/parse/parse_py/ConfHandler.py > index 24f81f7e9..bdac66004 100644 > --- a/lib/bb/parse/parse_py/ConfHandler.py > +++ b/lib/bb/parse/parse_py/ConfHandler.py > @@ -20,7 +20,7 @@ from bb.parse import ParseError, resolve_file, ast, logger, handle > __config_regexp__ = re.compile( r""" > ^ > (?P<exp>export\s+)? > - (?P<var>[a-zA-Z0-9\-_+.${}/~:]+?) > + (?P<var>([a-zA-Z0-9\-_${}/~] | [+.:](?!=))+?) > (\[(?P<flag>[a-zA-Z0-9\-_+.][a-zA-Z0-9\-_+.@/]*)\])? > > \s* ( > diff --git a/lib/toaster/toastermain/management/commands/buildimport.py b/lib/toaster/toastermain/management/commands/buildimport.py > index f7139aa04..37d27a445 100644 > --- a/lib/toaster/toastermain/management/commands/buildimport.py > +++ b/lib/toaster/toastermain/management/commands/buildimport.py > @@ -60,7 +60,7 @@ def _log(msg): > __config_regexp__ = re.compile( r""" > ^ > (?P<exp>export\s+)? > - (?P<var>[a-zA-Z0-9\-_+.${}/~]+?) > + (?P<var>([a-zA-Z0-9\-_${}/~] | [.+](?!=))+?) > (\[(?P<flag>[a-zA-Z0-9\-_+.]+)\])? > > \s* ( > -- > 2.34.1 > The above made me ponder on the existence of $, { and } in the regular expression for the variable name and what that can be (ab)used for. I realized that it allows you do define really odd variables such as: ${} = "foobar" } = "foobar" $ = "foobar" And while you cannot access them directly using, e.g., "${$}", you can access them using, e.g., "${@d.getVar('$')"... This also allows defining variables that do not match the syntax at all, e.g.: EQUAL = "=" ${EQUAL} = "bar" which will define the variable "=" and where "${@d.getVar('=')}" then returns "bar" as expected... I am not sure this is a problem or working as intended, but it is a bit odd. //Peter
On Wed, 2025-01-29 at 20:55 +0200, Nikolay Merinov via lists.openembedded.org wrote: > Old code that parse variable names in assignment commands behave differntly for > variables that ends with special symbol for single-character variable names and > multi-character variable names. For example: > > A+="1" # Change variable ${A}, '+' glued to '=' > A+ = "1" # Change variable ${A+} > > +="1" # Change variable ${+}, the '+' symbol not part of assignment operator > + = "1" # Change variable ${+} > > New code would always assume that '.=', '+=', and ':=' is assignment operator. > As result code like the following would raise parsing error > > +="value" > > While code with extra spaces would work as before > > + = "value" # Change variable ${+} > > This change allow to catch issues in code that generate bitbake configuration > files in a manner like "echo ${VARNAME}+=${VALUE} >> conf/local.conf" > > Signed-off-by: Nikolai Merinov <n.merinov@inango-systems.com> > --- > lib/bb/parse/parse_py/ConfHandler.py | 2 +- > lib/toaster/toastermain/management/commands/buildimport.py | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) This is an interesting bug, thanks for reporting it and sending a patch. I'm a little torn on what the right thing to do here is. I'm wondering if it is time to say that whitespace around the assignment operator is required. That will break metadata and I've found a few hundred places it would break. Equally, those can be fixed and we'd end up with a better end result as the whitespace does make recipe files more readable. I have a patch to fix most of the references in OE-Core. I'm worried that if we do what you patch does, we're complicating the code and potentially setting ourselves up for other new interesting future problems. I'd also love to see some test cases adding to lib/bb/tests/parse.py so that "bitbake-selftest bb.tests.parse" would detect these problems if we ever introduced them in future. Cheers, Richard
Hi Richard, > I'm worried that if we do what you patch does, we're complicating the > code and potentially setting ourselves up for other new interesting > future problems. How about moving complicated part out of regular expression? Please take a look to the second version of the patch. Regards, Nikolai
Hi Nikolai, On Mon, 2025-02-03 at 18:42 +0200, Nikolai Merinov wrote: > Hi Richard, > > > I'm worried that if we do what you patch does, we're complicating > > the > > code and potentially setting ourselves up for other new interesting > > future problems. > > How about moving complicated part out of regular expression? > > Please take a look to the second version of the patch. I prefer that version thanks! If we can tweak the test name, I think we can merge that and I'll continue to think about whether we should make the whitespace parsing stricter. Cheers, Richard
On Fri, 2025-01-31 at 20:40 +0000, Peter Kjellerstedt via lists.openembedded.org wrote: > > -----Original Message----- > > From: bitbake-devel@lists.openembedded.org <bitbake-devel@lists.openembedded.org> On Behalf Of Nikolay Merinov via lists.openembedded.org > > Sent: den 29 januari 2025 19:56 > > To: bitbake-devel@lists.openembedded.org > > Cc: Nikolai Merinov <n.merinov@inango-systems.com> > > Subject: [bitbake-devel] [PATCH] parse: Forbid ambiguous assignments to ${.}, ${+}, and ${:} variables > > > > Old code that parse variable names in assignment commands behave differntly for > > differntly -> differently > > > variables that ends with special symbol for single-character variable names and > > multi-character variable names. For example: > > > > A+="1" # Change variable ${A}, '+' glued to '=' > > A+ = "1" # Change variable ${A+} > > > > +="1" # Change variable ${+}, the '+' symbol not part of assignment operator > > + = "1" # Change variable ${+} > > > > New code would always assume that '.=', '+=', and ':=' is assignment operator. > > As result code like the following would raise parsing error > > > > +="value" > > > > While code with extra spaces would work as before > > > > + = "value" # Change variable ${+} > > > > This change allow to catch issues in code that generate bitbake configuration > > files in a manner like "echo ${VARNAME}+=${VALUE} >> conf/local.conf" > > > > Signed-off-by: Nikolai Merinov <n.merinov@inango-systems.com> > > --- > > lib/bb/parse/parse_py/ConfHandler.py | 2 +- > > lib/toaster/toastermain/management/commands/buildimport.py | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/lib/bb/parse/parse_py/ConfHandler.py b/lib/bb/parse/parse_py/ConfHandler.py > > index 24f81f7e9..bdac66004 100644 > > --- a/lib/bb/parse/parse_py/ConfHandler.py > > +++ b/lib/bb/parse/parse_py/ConfHandler.py > > @@ -20,7 +20,7 @@ from bb.parse import ParseError, resolve_file, ast, logger, handle > > __config_regexp__ = re.compile( r""" > > ^ > > (?P<exp>export\s+)? > > - (?P<var>[a-zA-Z0-9\-_+.${}/~:]+?) > > + (?P<var>([a-zA-Z0-9\-_${}/~] | [+.:](?!=))+?) > > (\[(?P<flag>[a-zA-Z0-9\-_+.][a-zA-Z0-9\-_+.@/]*)\])? > > > > \s* ( > > diff --git a/lib/toaster/toastermain/management/commands/buildimport.py b/lib/toaster/toastermain/management/commands/buildimport.py > > index f7139aa04..37d27a445 100644 > > --- a/lib/toaster/toastermain/management/commands/buildimport.py > > +++ b/lib/toaster/toastermain/management/commands/buildimport.py > > @@ -60,7 +60,7 @@ def _log(msg): > > __config_regexp__ = re.compile( r""" > > ^ > > (?P<exp>export\s+)? > > - (?P<var>[a-zA-Z0-9\-_+.${}/~]+?) > > + (?P<var>([a-zA-Z0-9\-_${}/~] | [.+](?!=))+?) > > (\[(?P<flag>[a-zA-Z0-9\-_+.]+)\])? > > > > \s* ( > > -- > > 2.34.1 > > > > The above made me ponder on the existence of $, { and } in the regular > expression for the variable name and what that can be (ab)used for. I > realized that it allows you do define really odd variables such as: > > ${} = "foobar" > } = "foobar" > $ = "foobar" > > And while you cannot access them directly using, e.g., "${$}", you can > access them using, e.g., "${@d.getVar('$')"... > > This also allows defining variables that do not match the syntax at all, > e.g.: > > EQUAL = "=" > ${EQUAL} = "bar" > > which will define the variable "=" and where "${@d.getVar('=')}" then > returns "bar" as expected... > > I am not sure this is a problem or working as intended, but it is a bit > odd. > The set/getVar API doesn't really sanity check the variable names so it is behaving much as I'd have expected. Whether it should be a bit stricter about variable names is a good question... Cheers, Richard
Hi Richard, > The test name needs changing from test_parse_exports though else it > will redefinine one of the other tests. Thank you for noticing this! I checked that I got test failure without rest part of my patch and immediately forget to ajust test name. Feel free to fix naming if it looks incorrectfor you. Regards, Nikolai
Hi Peter, > The above made me ponder on the existence of $, { and } in the regular > expression for the variable name and what that can be (ab)used for. I > realized that it allows you do define really odd variables such as: > > ${} = "foobar" > } = "foobar" > $ = "foobar" > > And while you cannot access them directly using, e.g., "${$}", you can > access them using, e.g., "${@d.getVar('$')"... > > This also allows defining variables that do not match the syntax at all, > e.g.: > > EQUAL = "=" > ${EQUAL} = "bar" > > which will define the variable "=" and where "${@d.getVar('=')}" then > returns "bar" as expected... > > I am not sure this is a problem or working as intended, but it is a bit > odd. I did not aware of bitbake development history, but I believe that there can be only one reason to do it: Makefile allow same kind of bizarre things. $ cat Makefile name = ==არ გააკეთო== $(name) = Hello World all: ; echo $(==არ გააკეთო==) $ make echo Hello World Hello World $ Regards, Nikolai
diff --git a/lib/bb/parse/parse_py/ConfHandler.py b/lib/bb/parse/parse_py/ConfHandler.py index 24f81f7e9..bdac66004 100644 --- a/lib/bb/parse/parse_py/ConfHandler.py +++ b/lib/bb/parse/parse_py/ConfHandler.py @@ -20,7 +20,7 @@ from bb.parse import ParseError, resolve_file, ast, logger, handle __config_regexp__ = re.compile( r""" ^ (?P<exp>export\s+)? - (?P<var>[a-zA-Z0-9\-_+.${}/~:]+?) + (?P<var>([a-zA-Z0-9\-_${}/~] | [+.:](?!=))+?) (\[(?P<flag>[a-zA-Z0-9\-_+.][a-zA-Z0-9\-_+.@/]*)\])? \s* ( diff --git a/lib/toaster/toastermain/management/commands/buildimport.py b/lib/toaster/toastermain/management/commands/buildimport.py index f7139aa04..37d27a445 100644 --- a/lib/toaster/toastermain/management/commands/buildimport.py +++ b/lib/toaster/toastermain/management/commands/buildimport.py @@ -60,7 +60,7 @@ def _log(msg): __config_regexp__ = re.compile( r""" ^ (?P<exp>export\s+)? - (?P<var>[a-zA-Z0-9\-_+.${}/~]+?) + (?P<var>([a-zA-Z0-9\-_${}/~] | [.+](?!=))+?) (\[(?P<flag>[a-zA-Z0-9\-_+.]+)\])? \s* (
Old code that parse variable names in assignment commands behave differntly for variables that ends with special symbol for single-character variable names and multi-character variable names. For example: A+="1" # Change variable ${A}, '+' glued to '=' A+ = "1" # Change variable ${A+} +="1" # Change variable ${+}, the '+' symbol not part of assignment operator + = "1" # Change variable ${+} New code would always assume that '.=', '+=', and ':=' is assignment operator. As result code like the following would raise parsing error +="value" While code with extra spaces would work as before + = "value" # Change variable ${+} This change allow to catch issues in code that generate bitbake configuration files in a manner like "echo ${VARNAME}+=${VALUE} >> conf/local.conf" Signed-off-by: Nikolai Merinov <n.merinov@inango-systems.com> --- lib/bb/parse/parse_py/ConfHandler.py | 2 +- lib/toaster/toastermain/management/commands/buildimport.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)