diff mbox series

ast/data/codeparser: Add dependencies from python module functions

Message ID 20221212160334.54431-1-richard.purdie@linuxfoundation.org
State Accepted, archived
Commit 91441e157e495b02db44e19e836afad366ee8924
Headers show
Series ast/data/codeparser: Add dependencies from python module functions | expand

Commit Message

Richard Purdie Dec. 12, 2022, 4:03 p.m. UTC
Moving code into python modules is a very effective way to reduce parsing
time and overhead in recipes. The downside has always been that any
dependency information on which variables those functions access is lost
and the hashes can therefore become less reliable.

This patch adds parsing of the imported module functions and that dependency
information is them injected back into the hash dependency information.

Intermodule function references are resolved to the full function
call names in our module namespace to ensure interfunction dependencies
are correctly handled too.

(Bitbake rev: 605c478ce14cdc3c02d6ef6d57146a76d436a83c)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/codeparser.py       | 44 +++++++++++++++++++++++++++++++++-----
 lib/bb/data.py             | 15 +++++++++----
 lib/bb/parse/ast.py        | 12 +++++++++++
 lib/bb/tests/codeparser.py | 14 ++++++------
 4 files changed, 69 insertions(+), 16 deletions(-)

Comments

Martin Jansa Feb. 15, 2023, 3:19 p.m. UTC | #1
On Mon, Dec 12, 2022 at 5:03 PM Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> Moving code into python modules is a very effective way to reduce parsing
> time and overhead in recipes. The downside has always been that any
> dependency information on which variables those functions access is lost
> and the hashes can therefore become less reliable.
>
> This patch adds parsing of the imported module functions and that
> dependency
> information is them injected back into the hash dependency information.
>
> Intermodule function references are resolved to the full function
> call names in our module namespace to ensure interfunction dependencies
> are correctly handled too.
>
> (Bitbake rev: 605c478ce14cdc3c02d6ef6d57146a76d436a83c)
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
>
...

> diff --git a/lib/bb/parse/ast.py b/lib/bb/parse/ast.py
> index 862087c77d..375ba3cb79 100644
> --- a/lib/bb/parse/ast.py
> +++ b/lib/bb/parse/ast.py
> @@ -290,6 +290,18 @@ class PyLibNode(AstNode):
>              toimport = getattr(bb.utils._context[self.namespace],
> "BBIMPORTS", [])
>              for i in toimport:
>                  bb.utils._context[self.namespace] =
> __import__(self.namespace + "." + i)
> +                mod = getattr(bb.utils._context[self.namespace], i)
> +                fn = getattr(mod, "__file__")
> +                funcs = {}
> +                for f in dir(mod):
> +                    if f.startswith("_"):
> +                        continue
> +                    fcall = getattr(mod, f)
> +                    if not callable(fcall):
> +                        continue
> +                    funcs[f] = fcall
> +                bb.codeparser.add_module_functions(fn, funcs, "%s.%s" %
> (self.namespace, i))
> +
>          except AttributeError as e:
>              bb.error("Error importing OE modules: %s" % str(e))
>

Hi,

a while ago I've mentioned that after making syntax error in metadata I got
no output at all from bitbake.

I've narrowed it down to this chunk and the issue is that my syntax error
was in qa.py and it was failing in:
bb.utils._context[self.namespace] = __import__(self.namespace + "." + i)

without any output from bitbake (nor bitbake-cookerdaemon.log).

If I add simple try-catch around this like:
-                bb.utils._context[self.namespace] =
__import__(self.namespace + "." + i)
+                try:
+                    bb.utils._context[self.namespace] =
__import__(self.namespace + "." + i)
+                except Exception as e:
+                    bb.error("Error importing OE module %s: %s" % (i,
str(e)))
+                    continue

Then I immediately get expected syntax error:
ERROR: Error importing OE module qa: invalid syntax (qa.py, line 229)
(followed by many errors about missing qa "Exception: AttributeError:
module 'oe' has no attribute 'qa'")

Where would be the right place to handle this?

If I break qa.py in langdale, I get reasonable error:
$ echo "def break()" >> oe-core/meta/lib/oe/qa.py
$ bitbake -k zlib-native
ERROR: Unable to parse Var <OE_IMPORTED[:=]>
Traceback (most recent call last):
  File "Var <OE_IMPORTED[:=]>", line 1, in <module>
  File
"/OE/lge/build/webos/langdale/oe-core/meta/classes-global/base.bbclass",
line 35, in oe_import(d=<bb.data_smart.DataSmart object at 0x7fa09b142790>):
                 # Make a python object accessible from the metadata
    >            bb.utils._context[toimport.split(".", 1)[0]] =
__import__(toimport)
             except AttributeError as e:
bb.data_smart.ExpansionError: Failure expanding variable OE_IMPORTED[:=],
expression was ${@oe_import(d)} which triggered exception SyntaxError:
expected ':' (qa.py, line 222)
The variable dependency chain for the failure is: OE_IMPORTED[:=]

with master:
$ time bitbake -DDDDDD -vvvvv zlib-native

real    0m0.564s
user    0m0.149s
sys     0m0.033s
Richard Purdie Feb. 15, 2023, 4:28 p.m. UTC | #2
On Wed, 2023-02-15 at 16:19 +0100, Martin Jansa wrote:
> On Mon, Dec 12, 2022 at 5:03 PM Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> > Moving code into python modules is a very effective way to reduce
> > parsing
> > time and overhead in recipes. The downside has always been that any
> > dependency information on which variables those functions access is
> > lost
> > and the hashes can therefore become less reliable.
> > 
> > This patch adds parsing of the imported module functions and that
> > dependency
> > information is them injected back into the hash dependency
> > information.
> > 
> > Intermodule function references are resolved to the full function
> > call names in our module namespace to ensure interfunction
> > dependencies
> > are correctly handled too.
> > 
> > (Bitbake rev: 605c478ce14cdc3c02d6ef6d57146a76d436a83c)
> > 
> > Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> > 
> 
> ... 
> > diff --git a/lib/bb/parse/ast.py b/lib/bb/parse/ast.py
> > index 862087c77d..375ba3cb79 100644
> > --- a/lib/bb/parse/ast.py
> > +++ b/lib/bb/parse/ast.py
> > @@ -290,6 +290,18 @@ class PyLibNode(AstNode):
> >              toimport = getattr(bb.utils._context[self.namespace],
> > "BBIMPORTS", [])
> >              for i in toimport:
> >                  bb.utils._context[self.namespace] =
> > __import__(self.namespace + "." + i)
> > +                mod = getattr(bb.utils._context[self.namespace],
> > i)
> > +                fn = getattr(mod, "__file__")
> > +                funcs = {}
> > +                for f in dir(mod):
> > +                    if f.startswith("_"):
> > +                        continue
> > +                    fcall = getattr(mod, f)
> > +                    if not callable(fcall):
> > +                        continue
> > +                    funcs[f] = fcall
> > +                bb.codeparser.add_module_functions(fn, funcs,
> > "%s.%s" % (self.namespace, i))
> > +
> >          except AttributeError as e:
> >              bb.error("Error importing OE modules: %s" % str(e))
> > 
> 
> 
> Hi,
> 
> a while ago I've mentioned that after making syntax error in metadata
> I got no output at all from bitbake.
> 
> I've narrowed it down to this chunk and the issue is that my syntax
> error was in qa.py and it was failing in:
> bb.utils._context[self.namespace] = __import__(self.namespace + "." +
> i)

Was there a traceback in bitbake-cookerdeamon.log for this before your
patch?

Cheers,

Richard
Martin Jansa Feb. 15, 2023, 4:45 p.m. UTC | #3
No, I haven't seen any error in bitbake-cookerdaemon.log, see attachment
(created with this change reverted and the same syntax error in qa.py).

On Wed, Feb 15, 2023 at 5:28 PM Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> On Wed, 2023-02-15 at 16:19 +0100, Martin Jansa wrote:
> > On Mon, Dec 12, 2022 at 5:03 PM Richard Purdie
> > <richard.purdie@linuxfoundation.org> wrote:
> > > Moving code into python modules is a very effective way to reduce
> > > parsing
> > > time and overhead in recipes. The downside has always been that any
> > > dependency information on which variables those functions access is
> > > lost
> > > and the hashes can therefore become less reliable.
> > >
> > > This patch adds parsing of the imported module functions and that
> > > dependency
> > > information is them injected back into the hash dependency
> > > information.
> > >
> > > Intermodule function references are resolved to the full function
> > > call names in our module namespace to ensure interfunction
> > > dependencies
> > > are correctly handled too.
> > >
> > > (Bitbake rev: 605c478ce14cdc3c02d6ef6d57146a76d436a83c)
> > >
> > > Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> > >
> >
> > ...
> > > diff --git a/lib/bb/parse/ast.py b/lib/bb/parse/ast.py
> > > index 862087c77d..375ba3cb79 100644
> > > --- a/lib/bb/parse/ast.py
> > > +++ b/lib/bb/parse/ast.py
> > > @@ -290,6 +290,18 @@ class PyLibNode(AstNode):
> > >              toimport = getattr(bb.utils._context[self.namespace],
> > > "BBIMPORTS", [])
> > >              for i in toimport:
> > >                  bb.utils._context[self.namespace] =
> > > __import__(self.namespace + "." + i)
> > > +                mod = getattr(bb.utils._context[self.namespace],
> > > i)
> > > +                fn = getattr(mod, "__file__")
> > > +                funcs = {}
> > > +                for f in dir(mod):
> > > +                    if f.startswith("_"):
> > > +                        continue
> > > +                    fcall = getattr(mod, f)
> > > +                    if not callable(fcall):
> > > +                        continue
> > > +                    funcs[f] = fcall
> > > +                bb.codeparser.add_module_functions(fn, funcs,
> > > "%s.%s" % (self.namespace, i))
> > > +
> > >          except AttributeError as e:
> > >              bb.error("Error importing OE modules: %s" % str(e))
> > >
> >
> >
> > Hi,
> >
> > a while ago I've mentioned that after making syntax error in metadata
> > I got no output at all from bitbake.
> >
> > I've narrowed it down to this chunk and the issue is that my syntax
> > error was in qa.py and it was failing in:
> > bb.utils._context[self.namespace] = __import__(self.namespace + "." +
> > i)
>
> Was there a traceback in bitbake-cookerdeamon.log for this before your
> patch?
>
> Cheers,
>
> Richard
>
Richard Purdie Feb. 17, 2023, 10:39 a.m. UTC | #4
On Wed, 2023-02-15 at 17:45 +0100, Martin Jansa wrote:
> No, I haven't seen any error in bitbake-cookerdaemon.log, see
> attachment (created with this change reverted and the same syntax
> error in qa.py).

Which makes me wonder where this error is actually ending up :/

Cheers,

Richard
diff mbox series

Patch

diff --git a/lib/bb/codeparser.py b/lib/bb/codeparser.py
index 9d66d3ae41..87846d85d6 100644
--- a/lib/bb/codeparser.py
+++ b/lib/bb/codeparser.py
@@ -27,6 +27,7 @@  import ast
 import sys
 import codegen
 import logging
+import inspect
 import bb.pysh as pysh
 import bb.utils, bb.data
 import hashlib
@@ -58,10 +59,34 @@  def check_indent(codestr):
 
     return codestr
 
-# A custom getstate/setstate using tuples is actually worth 15% cachesize by
-# avoiding duplication of the attribute names!
+modulecode_deps = {}
 
+def add_module_functions(fn, functions, namespace):
+    fstat = os.stat(fn)
+    fixedhash = fn + ":" + str(fstat.st_size) +  ":" + str(fstat.st_mtime)
+    #bb.warn(fixedhash)
+    for f in functions:
+        name = "%s.%s" % (namespace, f)
+        parser = bb.codeparser.PythonParser(name, logger)
+        try:
+            parser.parse_python(None, filename=fn, lineno=1, fixedhash=fixedhash+f)
+            #bb.warn("Cached %s" % f)
+        except KeyError:
+            lines, lineno = inspect.getsourcelines(functions[f])
+            src = "".join(lines)
+            parser.parse_python(src, filename=fn, lineno=lineno, fixedhash=fixedhash+f)
+            #bb.warn("Not cached %s" % f)
+        execs = parser.execs.copy()
+        # Expand internal module exec references
+        for e in parser.execs:
+            if e in functions:
+                execs.remove(e)
+                execs.add(namespace + "." + e)
+        bb.codeparser.modulecode_deps[name] = [parser.references.copy(), execs, parser.var_execs.copy(), parser.contains.copy()]
+        #bb.warn("%s: %s\nRefs:%s Execs: %s %s %s" % (name, src, parser.references, parser.execs, parser.var_execs, parser.contains))
 
+# A custom getstate/setstate using tuples is actually worth 15% cachesize by
+# avoiding duplication of the attribute names!
 class SetCache(object):
     def __init__(self):
         self.setcache = {}
@@ -289,11 +314,17 @@  class PythonParser():
         self.unhandled_message = "in call of %s, argument '%s' is not a string literal"
         self.unhandled_message = "while parsing %s, %s" % (name, self.unhandled_message)
 
-    def parse_python(self, node, lineno=0, filename="<string>"):
-        if not node or not node.strip():
+    # For the python module code it is expensive to have the function text so it is
+    # uses a different fixedhash to cache against. We can take the hit on obtaining the
+    # text if it isn't in the cache.
+    def parse_python(self, node, lineno=0, filename="<string>", fixedhash=None):
+        if not fixedhash and (not node or not node.strip()):
             return
 
-        h = bbhash(str(node))
+        if fixedhash:
+            h = fixedhash
+        else:
+            h = bbhash(str(node))
 
         if h in codeparsercache.pythoncache:
             self.references = set(codeparsercache.pythoncache[h].refs)
@@ -311,6 +342,9 @@  class PythonParser():
                 self.contains[i] = set(codeparsercache.pythoncacheextras[h].contains[i])
             return
 
+        if fixedhash and not node:
+            raise KeyError
+
         # Need to parse so take the hit on the real log buffer
         self.log = BufferedLogger('BitBake.Data.PythonParser', logging.DEBUG, self._log)
 
diff --git a/lib/bb/data.py b/lib/bb/data.py
index bfaa0410ea..841369699e 100644
--- a/lib/bb/data.py
+++ b/lib/bb/data.py
@@ -261,7 +261,7 @@  def emit_func_python(func, o=sys.__stdout__, d = init()):
                newdeps |= set((d.getVarFlag(dep, "vardeps") or "").split())
         newdeps -= seen
 
-def build_dependencies(key, keys, shelldeps, varflagsexcl, ignored_vars, d):
+def build_dependencies(key, keys, mod_funcs, shelldeps, varflagsexcl, ignored_vars, d):
     def handle_contains(value, contains, exclusions, d):
         newvalue = []
         if value:
@@ -289,6 +289,12 @@  def build_dependencies(key, keys, shelldeps, varflagsexcl, ignored_vars, d):
 
     deps = set()
     try:
+        if key in mod_funcs:
+            exclusions = set()
+            moddep = bb.codeparser.modulecode_deps[key]
+            value = handle_contains("", moddep[3], exclusions, d)
+            return frozenset((moddep[0] | keys & moddep[1]) - ignored_vars), value
+
         if key[-1] == ']':
             vf = key[:-1].split('[')
             if vf[1] == "vardepvalueexclude":
@@ -367,7 +373,8 @@  def build_dependencies(key, keys, shelldeps, varflagsexcl, ignored_vars, d):
 
 def generate_dependencies(d, ignored_vars):
 
-    keys = set(key for key in d if not key.startswith("__"))
+    mod_funcs = set(bb.codeparser.modulecode_deps.keys())
+    keys = set(key for key in d if not key.startswith("__")) | mod_funcs
     shelldeps = set(key for key in d.getVar("__exportlist", False) if d.getVarFlag(key, "export", False) and not d.getVarFlag(key, "unexport", False))
     varflagsexcl = d.getVar('BB_SIGNATURE_EXCLUDE_FLAGS')
 
@@ -376,7 +383,7 @@  def generate_dependencies(d, ignored_vars):
 
     tasklist = d.getVar('__BBTASKS', False) or []
     for task in tasklist:
-        deps[task], values[task] = build_dependencies(task, keys, shelldeps, varflagsexcl, ignored_vars, d)
+        deps[task], values[task] = build_dependencies(task, keys, mod_funcs, shelldeps, varflagsexcl, ignored_vars, d)
         newdeps = deps[task]
         seen = set()
         while newdeps:
@@ -385,7 +392,7 @@  def generate_dependencies(d, ignored_vars):
             newdeps = set()
             for dep in nextdeps:
                 if dep not in deps:
-                    deps[dep], values[dep] = build_dependencies(dep, keys, shelldeps, varflagsexcl, ignored_vars, d)
+                    deps[dep], values[dep] = build_dependencies(dep, keys, mod_funcs, shelldeps, varflagsexcl, ignored_vars, d)
                 newdeps |=  deps[dep]
             newdeps -= seen
         #print "For %s: %s" % (task, str(deps[task]))
diff --git a/lib/bb/parse/ast.py b/lib/bb/parse/ast.py
index 862087c77d..375ba3cb79 100644
--- a/lib/bb/parse/ast.py
+++ b/lib/bb/parse/ast.py
@@ -290,6 +290,18 @@  class PyLibNode(AstNode):
             toimport = getattr(bb.utils._context[self.namespace], "BBIMPORTS", [])
             for i in toimport:
                 bb.utils._context[self.namespace] = __import__(self.namespace + "." + i)
+                mod = getattr(bb.utils._context[self.namespace], i)
+                fn = getattr(mod, "__file__")
+                funcs = {}
+                for f in dir(mod):
+                    if f.startswith("_"):
+                        continue
+                    fcall = getattr(mod, f)
+                    if not callable(fcall):
+                        continue
+                    funcs[f] = fcall
+                bb.codeparser.add_module_functions(fn, funcs, "%s.%s" % (self.namespace, i))
+
         except AttributeError as e:
             bb.error("Error importing OE modules: %s" % str(e))
 
diff --git a/lib/bb/tests/codeparser.py b/lib/bb/tests/codeparser.py
index 71ed382ab8..a508f23bcb 100644
--- a/lib/bb/tests/codeparser.py
+++ b/lib/bb/tests/codeparser.py
@@ -318,7 +318,7 @@  d.getVar(a(), False)
             "filename": "example.bb",
         })
 
-        deps, values = bb.data.build_dependencies("FOO", set(self.d.keys()), set(), set(), set(), self.d)
+        deps, values = bb.data.build_dependencies("FOO", set(self.d.keys()), set(), set(), set(), set(), self.d)
 
         self.assertEqual(deps, set(["somevar", "bar", "something", "inexpand", "test", "test2", "a"]))
 
@@ -365,7 +365,7 @@  esac
         self.d.setVarFlags("FOO", {"func": True})
         self.setEmptyVars(execs)
 
-        deps, values = bb.data.build_dependencies("FOO", set(self.d.keys()), set(), set(), set(), self.d)
+        deps, values = bb.data.build_dependencies("FOO", set(self.d.keys()), set(), set(), set(), set(), self.d)
 
         self.assertEqual(deps, set(["somevar", "inverted"] + execs))
 
@@ -375,7 +375,7 @@  esac
         self.d.setVar("FOO", "foo=oe_libinstall; eval $foo")
         self.d.setVarFlag("FOO", "vardeps", "oe_libinstall")
 
-        deps, values = bb.data.build_dependencies("FOO", set(self.d.keys()), set(), set(), set(), self.d)
+        deps, values = bb.data.build_dependencies("FOO", set(self.d.keys()), set(), set(), set(), set(), self.d)
 
         self.assertEqual(deps, set(["oe_libinstall"]))
 
@@ -384,7 +384,7 @@  esac
         self.d.setVar("FOO", "foo=oe_libinstall; eval $foo")
         self.d.setVarFlag("FOO", "vardeps", "${@'oe_libinstall'}")
 
-        deps, values = bb.data.build_dependencies("FOO", set(self.d.keys()), set(), set(), set(), self.d)
+        deps, values = bb.data.build_dependencies("FOO", set(self.d.keys()), set(), set(), set(), set(), self.d)
 
         self.assertEqual(deps, set(["oe_libinstall"]))
 
@@ -399,7 +399,7 @@  esac
         # Check dependencies
         self.d.setVar('ANOTHERVAR', expr)
         self.d.setVar('TESTVAR', 'anothervalue testval testval2')
-        deps, values = bb.data.build_dependencies("ANOTHERVAR", set(self.d.keys()), set(), set(), set(), self.d)
+        deps, values = bb.data.build_dependencies("ANOTHERVAR", set(self.d.keys()), set(), set(), set(), set(), self.d)
         self.assertEqual(sorted(values.splitlines()),
                          sorted([expr,
                           'TESTVAR{anothervalue} = Set',
@@ -418,14 +418,14 @@  esac
         self.d.setVar('ANOTHERVAR', varval)
         self.d.setVar('TESTVAR', 'anothervalue testval testval2')
         self.d.setVar('TESTVAR2', 'testval3')
-        deps, values = bb.data.build_dependencies("ANOTHERVAR", set(self.d.keys()), set(), set(), set(["TESTVAR"]), self.d)
+        deps, values = bb.data.build_dependencies("ANOTHERVAR", set(self.d.keys()), set(), set(), set(), set(["TESTVAR"]), self.d)
         self.assertEqual(sorted(values.splitlines()), sorted([varval]))
         self.assertEqual(deps, set(["TESTVAR2"]))
         self.assertEqual(self.d.getVar('ANOTHERVAR').split(), ['testval3', 'anothervalue'])
 
         # Check the vardepsexclude flag is handled by contains functionality
         self.d.setVarFlag('ANOTHERVAR', 'vardepsexclude', 'TESTVAR')
-        deps, values = bb.data.build_dependencies("ANOTHERVAR", set(self.d.keys()), set(), set(), set(), self.d)
+        deps, values = bb.data.build_dependencies("ANOTHERVAR", set(self.d.keys()), set(), set(), set(), set(), self.d)
         self.assertEqual(sorted(values.splitlines()), sorted([varval]))
         self.assertEqual(deps, set(["TESTVAR2"]))
         self.assertEqual(self.d.getVar('ANOTHERVAR').split(), ['testval3', 'anothervalue'])