diff mbox series

parse/ConfHandler: Add warning for deprecated whitespace usage

Message ID 20250402091303.2346229-1-richard.purdie@linuxfoundation.org
State Accepted, archived
Commit 24772dd2ae6c0cd11540a260f15065f906fb0997
Headers show
Series parse/ConfHandler: Add warning for deprecated whitespace usage | expand

Commit Message

Richard Purdie April 2, 2025, 9:13 a.m. UTC
A lack of whitespace around variable assignment operators makes the
files harder to read.

There is a deeper issue in that a "+" character can sometimes be confused
between the variable name and the assignment operator.

Start showing warnings for such usage so we encourage people to use
consistent whitespace which helps with file readability in general.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/parse/parse_py/ConfHandler.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Martin Jansa April 4, 2025, 3:12 p.m. UTC | #1
Acked-by: Martin Jansa <martin.jansa@gmail.com>

I've sent an update to avoid warnings for few more layers which are
included in some of my builds (meta-filesystems, meta-arm,
meta-virtualization, meta-webos-ports, meta-webosose, meta-qt5,
meta-qt6, meta-raspberrypi and internal LGE layers).

I wrote a silly script which quickly proved that writing regexp with
sed which works correctly across all various shell/python/bitbake
syntaxes supported by bitbake is more difficult than initially
expected (as it can contain shell variables, bitbake variables, $, ',
", [], $@, ...).

That's why there are multiple versions (which might help in some cases
if you layer has many similar cases) and the comments documenting the
limitations and false-positives are longer than the original one-liner
I was using for the migration. Last couple cases where the script
didn't work I've usually adjusted manually. It was still useful for me
when doing the migrations and feel free to use it at your own risk.
"bitbake -nk world 2>&1 | tee world.log" and then pass the world.log
to the script.

#!/bin/bash

# https://lists.openembedded.org/g/bitbake-devel/message/17508

LOG=$1

# it's a bit silly and simple but resolved 99% of the issues in layers I'm using
grep WARNING:.*whitespace.around.the.assignment $LOG | sort -u | while
read LINE; do
    # there could be 2 paths as in:
    # WARNING: meta-webos/recipes-core/packagegroups/packagegroup-core-tools-profile.bbappend:10
has a lack of whitespace around the assignment: 'LTTNGTOOLS:no-lttng=
""'
    # WARNING: oe-core/meta/recipes-core/packagegroups/packagegroup-core-tools-profile.bb:
meta-webos/recipes-core/packagegroups/packagegroup-core-tools-profile.bbappend:10
has a lack of whitespace around the assignment: 'LTTNGTOOLS:no-lt
tng= ""'
    # but some global files will be shown only once e.g.:
    # WARNING: meta-foo/conf/layer.conf:10 has a lack of whitespace
around the assignment: 'BBFILE_PATTERN_meta-foo:= "^${LAYERDIR}/"'
    # the paths are full, I've just stripped the TOPDIR prefix
    # always take the one with lineno which is what we want to change

    f=`echo $LINE | sed "s/^WARNING: \([^:]*:
\)*\([^:]\+\):\([0-9]\+\) has a lack of whitespace around the
assignment: '\(.*\)'$/\2/g"`
    #l=`echo $LINE | sed "s/^WARNING: \([^:]*:
\)*\([^:]\+\):\([0-9]\+\) has a lack of whitespace around the
assignment: '\(.*\)'$/\3/g"`
    p=`echo $LINE | sed "s/^WARNING: \([^:]*:
\)*\([^:]\+\):\([0-9]\+\) has a lack of whitespace around the
assignment: '\(.*\)'$/\4/g"`

    if [ ! -f "$f" ] ; then
        echo "ERROR: failed to parse filename from $LINE"
        continue
    fi

    for o in ??= ?= += =+ := \\\\.= =\\\\. =; do
        # replace only first one found to avoid adding space in ??=
(when doing ?=) or += (when doing =)
        if echo $p | grep -qG "[^ ]$o" || echo $p | grep -qG "$o[^ ]"; then
            pp=$(echo $p | sed "s/ *$o */ $o /g; s/^'//g; s/'$//g")
            echo "sed -i 's@$p@$pp@g' $f"
            # some cases might have ^ like in "^${LAYERDIR}/", so use
e.g. @ to convert them,
            # but @ is often used in ${@..} so pick a character not
used in the WARNINGs you got.

            # there are some other limitations as well as some
patterns might have ' or " in them
            # or both typically with ${@..}, e.g. one of mine:
            # 'WEBOS_SUBMISSION_NUMBER="${@
'${WEBOS_SUBMISSION}'.split('.')[-1] }"'
            # either fix those manually or with manual sed matching
just a part of it.

            # another case where this won't work are variables with
line breaks e.g.:
            # 'CXXFLAGS +="
-I${STAGING_INCDIR}/media-resource-calculator-clang
-I${STAGING_INCDIR}/cbe "'
            # is triggered on recipe which has:
            # CXXFLAGS +=" \
            #    -I${STAGING_INCDIR}/media-resource-calculator-clang \
            #    -I${STAGING_INCDIR}/cbe \
            # "
            # and the regexp again doesn't work

            # it also doesn't work well on cases like:
            # 'EXTRA_OEMAKE="BUILDTAGS=''"'
            # often used in meta-virtualization where only first =
should have spaces around

            # sed -i 's^'"$p"'^'"$pp"'^g' $f
            # sed -i 's@'"$p"'@'"$pp"'@g' $f
            sed -i 's&'"$p"'&'"$pp"'&g' $f
            break;
        fi
    done
done

On Wed, Apr 2, 2025 at 11:13 AM Richard Purdie via
lists.openembedded.org
<richard.purdie=linuxfoundation.org@lists.openembedded.org> wrote:
>
> A lack of whitespace around variable assignment operators makes the
> files harder to read.
>
> There is a deeper issue in that a "+" character can sometimes be confused
> between the variable name and the assignment operator.
>
> Start showing warnings for such usage so we encourage people to use
> consistent whitespace which helps with file readability in general.
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
>  lib/bb/parse/parse_py/ConfHandler.py | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/lib/bb/parse/parse_py/ConfHandler.py b/lib/bb/parse/parse_py/ConfHandler.py
> index 1bde597254..675838d845 100644
> --- a/lib/bb/parse/parse_py/ConfHandler.py
> +++ b/lib/bb/parse/parse_py/ConfHandler.py
> @@ -23,7 +23,7 @@ __config_regexp__  = re.compile( r"""
>      (?P<var>[a-zA-Z0-9\-_+.${}/~:]*?)
>      (\[(?P<flag>[a-zA-Z0-9\-_+.][a-zA-Z0-9\-_+.@/]*)\])?
>
> -    \s* (
> +    (?P<whitespace>\s*) (
>          (?P<colon>:=) |
>          (?P<lazyques>\?\?=) |
>          (?P<ques>\?=) |
> @@ -32,7 +32,7 @@ __config_regexp__  = re.compile( r"""
>          (?P<predot>=\.) |
>          (?P<postdot>\.=) |
>          =
> -    ) \s*
> +    ) (?P<whitespace2>\s*)
>
>      (?!'[^']*'[^']*'$)
>      (?!\"[^\"]*\"[^\"]*\"$)
> @@ -168,6 +168,8 @@ def feeder(lineno, s, fn, statements, baseconfig=False, conffile=True):
>          groupd = m.groupdict()
>          if groupd['var'] == "":
>              raise ParseError("Empty variable name in assignment: '%s'" % s, fn, lineno);
> +        if not groupd['whitespace'] or not groupd['whitespace2']:
> +            logger.warning("%s:%s has a lack of whitespace around the assignment: '%s'" % (fn, lineno, s))
>          ast.handleData(statements, fn, lineno, groupd)
>          return
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#17508): https://lists.openembedded.org/g/bitbake-devel/message/17508
> Mute This Topic: https://lists.openembedded.org/mt/112043500/3617156
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [martin.jansa@gmail.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
Martin Jansa April 10, 2025, 6:36 a.m. UTC | #2
On Fri, Apr 4, 2025 at 5:12 PM Martin Jansa via lists.openembedded.org
<martin.jansa=gmail.com@lists.openembedded.org> wrote:
>
> Acked-by: Martin Jansa <martin.jansa@gmail.com>
>
> I've sent an update to avoid warnings for few more layers which are
> included in some of my builds (meta-filesystems, meta-arm,
> meta-virtualization, meta-webos-ports, meta-webosose, meta-qt5,
> meta-qt6, meta-raspberrypi and internal LGE layers).
>
> I wrote a silly script which quickly proved that writing regexp with
> sed which works correctly across all various shell/python/bitbake
> syntaxes supported by bitbake is more difficult than initially
> expected (as it can contain shell variables, bitbake variables, $, ',
> ", [], $@, ...).
>
> That's why there are multiple versions (which might help in some cases
> if you layer has many similar cases) and the comments documenting the
> limitations and false-positives are longer than the original one-liner
> I was using for the migration. Last couple cases where the script
> didn't work I've usually adjusted manually. It was still useful for me
> when doing the migrations and feel free to use it at your own risk.
> "bitbake -nk world 2>&1 | tee world.log" and then pass the world.log
> to the script.
>
> #!/bin/bash
>
> # https://lists.openembedded.org/g/bitbake-devel/message/17508
>
> LOG=$1
>
> # it's a bit silly and simple but resolved 99% of the issues in layers I'm using
> grep WARNING:.*whitespace.around.the.assignment $LOG | sort -u | while
> read LINE; do
>     # there could be 2 paths as in:
>     # WARNING: meta-webos/recipes-core/packagegroups/packagegroup-core-tools-profile.bbappend:10
> has a lack of whitespace around the assignment: 'LTTNGTOOLS:no-lttng=
> ""'
>     # WARNING: oe-core/meta/recipes-core/packagegroups/packagegroup-core-tools-profile.bb:
> meta-webos/recipes-core/packagegroups/packagegroup-core-tools-profile.bbappend:10
> has a lack of whitespace around the assignment: 'LTTNGTOOLS:no-lt
> tng= ""'
>     # but some global files will be shown only once e.g.:
>     # WARNING: meta-foo/conf/layer.conf:10 has a lack of whitespace
> around the assignment: 'BBFILE_PATTERN_meta-foo:= "^${LAYERDIR}/"'
>     # the paths are full, I've just stripped the TOPDIR prefix
>     # always take the one with lineno which is what we want to change
>
>     f=`echo $LINE | sed "s/^WARNING: \([^:]*:
> \)*\([^:]\+\):\([0-9]\+\) has a lack of whitespace around the
> assignment: '\(.*\)'$/\2/g"`
>     #l=`echo $LINE | sed "s/^WARNING: \([^:]*:
> \)*\([^:]\+\):\([0-9]\+\) has a lack of whitespace around the
> assignment: '\(.*\)'$/\3/g"`
>     p=`echo $LINE | sed "s/^WARNING: \([^:]*:
> \)*\([^:]\+\):\([0-9]\+\) has a lack of whitespace around the
> assignment: '\(.*\)'$/\4/g"`
>
>     if [ ! -f "$f" ] ; then
>         echo "ERROR: failed to parse filename from $LINE"
>         continue
>     fi
>
>     for o in ??= ?= += =+ := \\\\.= =\\\\. =; do
>         # replace only first one found to avoid adding space in ??=
> (when doing ?=) or += (when doing =)
>         if echo $p | grep -qG "[^ ]$o" || echo $p | grep -qG "$o[^ ]"; then
>             pp=$(echo $p | sed "s/ *$o */ $o /g; s/^'//g; s/'$//g")
>             echo "sed -i 's@$p@$pp@g' $f"
>             # some cases might have ^ like in "^${LAYERDIR}/", so use
> e.g. @ to convert them,
>             # but @ is often used in ${@..} so pick a character not
> used in the WARNINGs you got.
>
>             # there are some other limitations as well as some
> patterns might have ' or " in them
>             # or both typically with ${@..}, e.g. one of mine:
>             # 'WEBOS_SUBMISSION_NUMBER="${@
> '${WEBOS_SUBMISSION}'.split('.')[-1] }"'
>             # either fix those manually or with manual sed matching
> just a part of it.
>
>             # another case where this won't work are variables with
> line breaks e.g.:
>             # 'CXXFLAGS +="
> -I${STAGING_INCDIR}/media-resource-calculator-clang
> -I${STAGING_INCDIR}/cbe "'
>             # is triggered on recipe which has:
>             # CXXFLAGS +=" \
>             #    -I${STAGING_INCDIR}/media-resource-calculator-clang \
>             #    -I${STAGING_INCDIR}/cbe \
>             # "
>             # and the regexp again doesn't work
>
>             # it also doesn't work well on cases like:
>             # 'EXTRA_OEMAKE="BUILDTAGS=''"'
>             # often used in meta-virtualization where only first =
> should have spaces around
>
>             # sed -i 's^'"$p"'^'"$pp"'^g' $f
>             # sed -i 's@'"$p"'@'"$pp"'@g' $f
>             sed -i 's&'"$p"'&'"$pp"'&g' $f
>             break;
>         fi
>     done
> done

Bruce was brave enough trying to use this yesterday and mentioned that
the long lines were wrapped in e-mail (as I haven't send it with git
send-email), here is it as a file:

https://github.com/shr-project/meta-webosose/blob/jansa/master/meta-webos/scripts/add-whitespace-around-assignment.sh

I also forgot to mention that the warnings are shown only when it's
not parsed in cache, so if you want to re-run it after initial pass
then remove the parser cache before calling bitbake again.

Cheers,
diff mbox series

Patch

diff --git a/lib/bb/parse/parse_py/ConfHandler.py b/lib/bb/parse/parse_py/ConfHandler.py
index 1bde597254..675838d845 100644
--- a/lib/bb/parse/parse_py/ConfHandler.py
+++ b/lib/bb/parse/parse_py/ConfHandler.py
@@ -23,7 +23,7 @@  __config_regexp__  = re.compile( r"""
     (?P<var>[a-zA-Z0-9\-_+.${}/~:]*?)
     (\[(?P<flag>[a-zA-Z0-9\-_+.][a-zA-Z0-9\-_+.@/]*)\])?
 
-    \s* (
+    (?P<whitespace>\s*) (
         (?P<colon>:=) |
         (?P<lazyques>\?\?=) |
         (?P<ques>\?=) |
@@ -32,7 +32,7 @@  __config_regexp__  = re.compile( r"""
         (?P<predot>=\.) |
         (?P<postdot>\.=) |
         =
-    ) \s*
+    ) (?P<whitespace2>\s*)
 
     (?!'[^']*'[^']*'$)
     (?!\"[^\"]*\"[^\"]*\"$)
@@ -168,6 +168,8 @@  def feeder(lineno, s, fn, statements, baseconfig=False, conffile=True):
         groupd = m.groupdict()
         if groupd['var'] == "":
             raise ParseError("Empty variable name in assignment: '%s'" % s, fn, lineno);
+        if not groupd['whitespace'] or not groupd['whitespace2']:
+            logger.warning("%s:%s has a lack of whitespace around the assignment: '%s'" % (fn, lineno, s))
         ast.handleData(statements, fn, lineno, groupd)
         return