[RFC,1/2] add a "bitbake" pygments lexer

Message ID 20260421172416.1801567-2-twoerner@gmail.com
State Under Review
Series support bitbake blocks

Commit Message

Trevor Woerner April 21, 2026, 5:24 p.m. UTC
From: Trevor Woerner <trevor.woerner@amd.com>

Highlighting renders correctly across the BitBake constructs from
the upstream vim syntax:
  comments
  all assignment operators (=, ?=, +=, =+, :=, ??=, .=, =.)
  ${VAR} and ${@...} interpolation
  OE override chains (FILES:${PN}-doc, :append:class-target)
  varflags (do_install[depends])
  inherit/include/require
  addtask ... after ... before ...
  EXPORT_FUNCTIONS
  shell function bodies (delegated to BashLexer)
  python/fakeroot python task bodies and top-level def blocks (delegated
    to PythonLexer).
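
As a rough illustration of the assignment-operator and override-chain handling listed above, the sketch below applies regexes modelled on `_IDENT`, `_OVERRIDE` and `_ASSIGN` from this patch to a few sample lines, using only Python's `re` module. The helper name `split_assignment` and the sample lines are assumptions made for this example; none of it is part of the patch.

```python
import re

# Modelled on _IDENT, _OVERRIDE and _ASSIGN in documentation/sphinx/bitbake.py.
IDENT = r'[A-Za-z_][A-Za-z0-9_\-.+]*'
OVERRIDE = r'(?::[A-Za-z0-9_\-.+${}]+)*'
ASSIGN = r'(?:\?\?=|\?=|:=|\+=|=\+|\.=|=\.|=)'

# Same shape as the lexer's "[export ]VAR[:override...] OP" rule.
ASSIGNMENT = re.compile(
    r'^(export[ \t]+)?(' + IDENT + r')(' + OVERRIDE + r')'
    r'([ \t]*)(' + ASSIGN + r')'
)


def split_assignment(line):
    """Hypothetical helper: return (variable, override chain, operator)."""
    match = ASSIGNMENT.match(line)
    if match is None:
        return None  # not a plain assignment (e.g. a varflag or a directive)
    return match.group(2), match.group(3), match.group(5)


SAMPLES = (
    'SUMMARY = "Hello"',
    'FILES:${PN}-doc += "${docdir}"',
    'PV ??= "1.0"',
    'EXTRA =+ "first"',
    'do_install[depends] = "foo:do_build"',
)

for sample in SAMPLES:
    print(sample, '->', split_assignment(sample))
```

The bare `=` is listed last in the alternation so that `=+` and `=.` are not cut short to `=`. The varflag line returns `None` here because the patch handles varflags with a separate rule.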

Changes

  documentation/sphinx/bitbake.py is a new local Sphinx extension that
    defines BitbakeLexer (a Pygments RegexLexer) and registers it with
    Sphinx via sphinx.highlighting.lexers under the aliases bitbake and bb.
    Token rules mirror contrib/vim/syntax/bitbake.vim from the bitbake repo,
    and inner shell/python regions are dispatched to the existing BashLexer
    / PythonLexer using Pygments' using() helper.

  documentation/conf.py appends 'bitbake' to extensions so the lexer is
    loaded on every build.

Usage

  In any .rst file you can now write:

  .. code-block:: bitbake

     SUMMARY = "Hello"
     inherit autotools
     do_install:append() {
         install -d ${D}${bindir}
     }

  bb works as a shorter alias, and either name can also be used with the
  highlight directive (for example, .. highlight:: bitbake).

AI-Generated: codex/claude-opus 4.7 (xhigh)
Signed-off-by: Trevor Woerner <trevor.woerner@amd.com>
---
 documentation/conf.py           |   3 +-
 documentation/sphinx/bitbake.py | 195 ++++++++++++++++++++++++++++++++
 2 files changed, 197 insertions(+), 1 deletion(-)
 create mode 100644 documentation/sphinx/bitbake.py

Comments

Antonin Godard April 22, 2026, 6:55 a.m. UTC | #1
Hi,

On Tue Apr 21, 2026 at 7:24 PM CEST, Trevor Woerner via lists.yoctoproject.org wrote:
[...]
> +# Pygments does not ship a BitBake lexer, so this Sphinx extension provides
> +# one.

Have you considered contributing this lexer to Pygments instead? I would rather
have it there than having to maintain it here.

Thanks,
Antonin
Trevor Woerner April 22, 2026, 5:15 p.m. UTC | #2
On Wed 2026-04-22 @ 08:55:39 AM, Antonin Godard wrote:
> [...]
> Have you considered contributing this lexer to Pygments instead? I would rather
> have it there than having to maintain it here.

All development on Pygments happens on GitHub: https://github.com/pygments/pygments
There are currently 143 open pull requests: https://github.com/pygments/pygments/pulls
(some dating back to 2019).
However, the latest release was 3 weeks ago: https://github.com/pygments/pygments/releases

I will create a pull request for this feature and see where it goes. The
contributor information for the Pygments project is very clear that they
are not interested in adding/maintaining lexers for "pet" languages.

https://pygments.org/docs/contributing/

Of course *we* all know how large the BitBake user community is, but will
we be able to convince them? BitBake is widely used, but not compared to,
say, C or Python. All I can do is submit it and see. Realistically, we may
well end up maintaining it ourselves if we want this feature.

It was interesting to me to discover that our own documentation does not
know how to handle bitbake syntax.
Trevor Woerner April 22, 2026, 7:33 p.m. UTC | #3
On Wed 2026-04-22 @ 01:15:24 PM, Trevor Woerner wrote:
> [...]
> I will create a pull request for this feature and see where it goes.
> [...]

We'll see what they think: https://github.com/pygments/pygments/pull/3103

Antonin Godard April 23, 2026, 7:39 a.m. UTC | #4
Hi,

On Wed Apr 22, 2026 at 9:33 PM CEST, Trevor Woerner wrote:
> [...]
> We'll see what they think: https://github.com/pygments/pygments/pull/3103

BitBake's syntax does not change often, I would say, so once the lexer has
settled in, I don't think it will need to change much.

Thanks for the PR! Let's see where this goes.

Antonin
Patch

diff --git a/documentation/conf.py b/documentation/conf.py
index e5e3a8a89abd..aa2ec8d45ed3 100644
--- a/documentation/conf.py
+++ b/documentation/conf.py
@@ -68,7 +68,8 @@  extensions = [
     'sphinx.ext.intersphinx',
     'sphinx_copybutton',
     'sphinxcontrib.rsvgconverter',
-    'yocto-vars'
+    'yocto-vars',
+    'bitbake',
 ]
 autosectionlabel_prefix_document = True
 
diff --git a/documentation/sphinx/bitbake.py b/documentation/sphinx/bitbake.py
new file mode 100644
index 000000000000..c521340c391f
--- /dev/null
+++ b/documentation/sphinx/bitbake.py
@@ -0,0 +1,195 @@ 
+#!/usr/bin/env python
+#
+# SPDX-License-Identifier: CC-BY-SA-2.0-UK
+#
+# Pygments lexer for BitBake metadata files (``.bb``, ``.bbclass``,
+# ``.bbappend``, ``.inc``, ``.conf``) and inline ``code-block:: bitbake``
+# snippets used throughout the Yocto Project documentation.
+#
+# Pygments does not ship a BitBake lexer, so this Sphinx extension provides
+# one. The token rules below mirror the upstream BitBake vim syntax shipped
+# in ``contrib/vim/syntax/bitbake.vim`` of the bitbake repository:
+# https://git.openembedded.org/bitbake/tree/contrib/vim/syntax/bitbake.vim
+#
+# Highlights:
+#   - comments, line continuations, varflags
+#   - ``${VAR}`` and ``${@python_expr}`` interpolations
+#   - assignment operators (``=``, ``:=``, ``+=``, ``=+``, ``.=``, ``=.``,
+#     ``?=``, ``??=``) with optional ``export`` prefix and OE override
+#     suffixes (``VAR:append``, ``FILES:${PN}-doc``...)
+#   - ``inherit``, ``include``, ``require`` directives
+#   - ``addtask`` / ``deltask`` / ``addhandler`` / ``EXPORT_FUNCTIONS``
+#     statements (with the ``after`` / ``before`` keywords)
+#   - shell task bodies (``do_install() { ... }``) delegated to ``BashLexer``
+#   - ``python``, ``fakeroot python``, anonymous ``python() { ... }`` task
+#     bodies and top-level ``def`` blocks delegated to ``PythonLexer``
+
+import re
+
+from pygments.lexer import RegexLexer, bygroups, include, using, words
+from pygments.lexers.python import PythonLexer
+from pygments.lexers.shell import BashLexer
+from pygments.token import (
+    Comment,
+    Keyword,
+    Name,
+    Operator,
+    Punctuation,
+    String,
+    Text,
+    Whitespace,
+)
+
+__version__ = '1.0'
+
+# A bare BitBake identifier (variable, function or flag name); digits, ``-``,
+# ``.`` and ``+`` are allowed. Lazy, so ``VAR+=`` leaves ``+`` to the operator.
+_IDENT = r'[A-Za-z_][A-Za-z0-9_\-.+]*?'
+
+# Optional OE override chain such as ``:append``, ``:remove``, ``:class-target``
+# or ``:${PN}-doc``. Anchored so it only consumes ``:foo`` runs and never
+# eats the leading ``:`` of the ``:=`` assignment operator.
+_OVERRIDE = r'(?::[A-Za-z0-9_\-.+${}]+)*'
+
+# All BitBake variable assignment operators, ordered so that the longer
+# operators win the regex alternation.
+_ASSIGN = r'(?:\?\?=|\?=|:=|\+=|=\+|\.=|=\.|=)'
+
+
+class BitbakeLexer(RegexLexer):
+    """Lexer for BitBake recipes, classes, includes and configuration."""
+
+    name = 'BitBake'
+    aliases = ['bitbake', 'bb']
+    filenames = ['*.bb', '*.bbclass', '*.bbappend', '*.inc', '*.conf']
+    mimetypes = ['text/x-bitbake']
+
+    flags = re.MULTILINE
+
+    tokens = {
+        'root': [
+            (r'[ \t]+', Whitespace),
+            (r'\n', Whitespace),
+            (r'#.*$', Comment.Single),
+
+            # ``python [name[:override]]() { ... }`` blocks (also ``fakeroot
+            # python``). Must be tried before the generic shell function rule so
+            # the ``python`` keyword is not mistaken for a shell function name.
+            (r'(^(?:fakeroot[ \t]+)?)(python)((?:[ \t]+' + _IDENT + _OVERRIDE + r')?)'
+             r'([ \t]*\([ \t]*\)[ \t]*)(\{[ \t]*\n)'
+             r'((?:.*\n)*?)'
+             r'(^\}[ \t]*$)',
+             bygroups(Keyword.Type, Keyword, Name.Function, Text,
+                      Punctuation, using(PythonLexer), Punctuation)),
+
+            # Shell task bodies: ``[fakeroot ]name[:override]() { ... }``.
+            (r'(^(?:fakeroot[ \t]+)?)(' + _IDENT + r')(' + _OVERRIDE + r')'
+             r'([ \t]*\([ \t]*\)[ \t]*)(\{[ \t]*\n)'
+             r'((?:.*\n)*?)'
+             r'(^\}[ \t]*$)',
+             bygroups(Keyword.Type, Name.Function, Name.Decorator, Text,
+                      Punctuation, using(BashLexer), Punctuation)),
+
+            # Top-level python ``def`` blocks; the body is any run of
+            # indented or blank lines following the signature.
+            (r'^def[ \t]+' + _IDENT + r'[ \t]*\([^)]*\)[ \t]*:[ \t]*\n'
+             r'(?:[ \t]+.*\n|\n)+',
+             using(PythonLexer)),
+
+            # ``inherit`` / ``include`` / ``require`` directives.
+            (r'^(inherit|include|require)\b',
+             Keyword.Namespace, 'include-line'),
+
+            # ``addtask`` / ``deltask`` / ``addhandler`` / ``EXPORT_FUNCTIONS``.
+            (r'^(addtask|deltask|addhandler|EXPORT_FUNCTIONS)\b',
+             Keyword, 'statement'),
+
+            # ``VAR[flag] = "value"`` (varflag assignment).
+            (r'^(' + _IDENT + r')(\[)(' + _IDENT + r')(\])([ \t]*)('
+             + _ASSIGN + r')',
+             bygroups(Name.Variable, Punctuation, Name.Attribute,
+                      Punctuation, Whitespace, Operator),
+             'value'),
+
+            # ``[export ]VAR[:override...] OP "value"`` assignments.
+            (r'^(export[ \t]+)?(' + _IDENT + r')(' + _OVERRIDE + r')'
+             r'([ \t]*)(' + _ASSIGN + r')',
+             bygroups(Keyword.Type, Name.Variable, Name.Decorator,
+                      Whitespace, Operator),
+             'value'),
+
+            # Anything else: fall through one character at a time.
+            (r'.', Text),
+        ],
+
+        'include-line': [
+            (r'[ \t]+', Whitespace),
+            (r'\\\n', Text),
+            (r'\n', Whitespace, '#pop'),
+            include('interp'),
+            (r'[^\s$]+', String),
+        ],
+
+        'statement': [
+            (r'[ \t]+', Whitespace),
+            (r'\\\n', Text),
+            (r'\n', Whitespace, '#pop'),
+            (words(('after', 'before'), suffix=r'\b'), Keyword),
+            include('interp'),
+            (r'[^\s$\\]+', Name),
+        ],
+
+        'value': [
+            (r'[ \t]+', Whitespace),
+            (r'\\\n', String.Escape),
+            (r'\n', Whitespace, '#pop'),
+            (r'"', String.Double, 'string-double'),
+            (r"'", String.Single, 'string-single'),
+            include('interp'),
+            (r'[^\s"\'$\\]+', String),
+        ],
+
+        'string-double': [
+            (r'\\\n', String.Escape),
+            (r'\\.', String.Escape),
+            (r'"', String.Double, '#pop'),
+            include('interp'),
+            (r'[^"\\$]+', String.Double),
+        ],
+
+        'string-single': [
+            (r'\\\n', String.Escape),
+            (r'\\.', String.Escape),
+            (r"'", String.Single, '#pop'),
+            include('interp'),
+            (r"[^'\\$]+", String.Single),
+        ],
+
+        'interp': [
+            # ``${@ python expression }`` evaluated by BitBake at parse time.
+            (r'\$\{@', String.Interpol, 'py-interp'),
+            # ``${VAR}`` variable expansion.
+            (r'(\$\{)([A-Za-z0-9_\-:.+/]+)(\})',
+             bygroups(String.Interpol, Name.Variable, String.Interpol)),
+        ],
+
+        'py-interp': [
+            (r'\}', String.Interpol, '#pop'),
+            (r'[^}]+', using(PythonLexer)),
+        ],
+    }
+
+
+def setup(app):
+    """Register the BitBake lexer with Sphinx/Pygments."""
+    from sphinx.highlighting import lexers
+
+    bb_lexer = BitbakeLexer()
+    lexers['bitbake'] = bb_lexer
+    lexers['bb'] = bb_lexer
+
+    return {
+        'version': __version__,
+        'parallel_read_safe': True,
+        'parallel_write_safe': True,
+    }
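
For readers who want to poke at the interpolation handling outside Pygments, here is a minimal sketch using only Python's `re` module. The pattern is modelled on the `interp` and `py-interp` states in the patch above; the names `INTERP` and `find_interpolations` and the sample strings are assumptions made for this example.

```python
import re

# Modelled on the 'interp' / 'py-interp' states: ``${@`` opens a Python
# expression that runs to the next ``}``; ``${NAME}`` is a plain variable.
INTERP = re.compile(
    r'\$\{@(?P<python>[^}]+)\}'
    r'|\$\{(?P<var>[A-Za-z0-9_\-:.+/]+)\}'
)


def find_interpolations(value):
    """Return ('python', expr) / ('var', name) pairs in order of appearance."""
    found = []
    for match in INTERP.finditer(value):
        kind = match.lastgroup  # 'python' or 'var', whichever branch matched
        found.append((kind, match.group(kind)))
    return found


print(find_interpolations('install -d ${D}${bindir}'))
print(find_interpolations('${BPN}-${@d.getVar("PV")}.tar.gz'))
```

Because the expression is matched with `[^}]+`, an expression that itself contains `}` (a dict literal, say) ends early in this sketch, and the `py-interp` state as written appears to behave the same way.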