diff mbox series

python3: Allow to specify which pyc files to keep

Message ID 2c0885e1-8821-4470-8e54-dcad6a241834@hqv.ch
State New
Series python3: Allow to specify which pyc files to keep

Commit Message

Lukas Woodtli April 10, 2025, 6:37 a.m. UTC
From d7159cd842fd2a10ba5aeff62877be2ebb2eab87 Mon Sep 17 00:00:00 2001
From: Lukas Woodtli <lukas.woodtli@husqvarnagroup.com>
Date: Mon, 7 Apr 2025 11:55:33 +0200
Subject: [PATCH] python3: Allow to specify which pyc files to keep

The pyc files for a specific optimization level can be kept and are
installed in the final image.

Signed-off-by: Lukas Woodtli <lukas.woodtli@husqvarnagroup.com>
---
  .../recipes-devtools/python/python3_3.13.2.bb | 32 +++++++++++++++----
  1 file changed, 25 insertions(+), 7 deletions(-)

              python3 ${UNPACKDIR}/reformat_sysconfig.py $c
@@ -249,10 +255,25 @@ do_install:append() {
          # so remove it too
          rm -f ${D}${libdir}/python${PYTHON_MAJMIN}/__pycache__/traceback.cpython*

-        # Remove the opt-1.pyc and opt-2.pyc files. They effectively waste space on embedded
-        # style targets as they're only used when python is called with the -O or -OO options
-        # which is rare.
-        find ${D} -name *opt-*.pyc -delete
+        if [ "${INCLUDE_PYCS}" -eq "1" ]; then
+             # Remove only the .pyc files of unused optimization level.
+            if [ "${PYCS_OPT_LEVEL}" -eq "0" ]; then
+                # keep only unoptimized .pyc files
+                find ${D} -name *.opt-1.pyc -exec rm -f {} \;
+                find ${D} -name *.opt-2.pyc -exec rm -f {} \;
+            elif [ "${PYCS_OPT_LEVEL}" -eq "1" ]; then
+                # keep only .pyc files with optimization level 1
+                find ${D} -name *.pyc -and -not -name *.opt-1.pyc -exec rm -f {} \;
+            elif [ "${PYCS_OPT_LEVEL}" -eq "2" ]; then
+                # keep only .pyc files with optimization level 2
+                find ${D} -name *.pyc -and -not -name *.opt-2.pyc -exec rm -f {} \;
+            else
+                bberror "Python optimization level ${PYCS_OPT_LEVEL} is not supported"
+            fi
+        else
+            # remove all .pyc files
+            find ${D} -name *.pyc -delete
+        fi
  }

  do_install:append:class-nativesdk () {
@@ -326,9 +347,6 @@ py_package_preprocess () {
          rm -rf ${PKGD}/${libdir}/python-sysconfigdata
  }

-# We want bytecode precompiled .py files (.pyc's) by default
-# but the user may set it on their own conf
-INCLUDE_PYCS ?= "1"

  python(){
      import collections, json

Comments

Alexander Kanavin April 10, 2025, 8:02 a.m. UTC | #1
Should we rather just package the higher level .pyc into their own packages?

Alex


Lukas Woodtli April 10, 2025, 11:36 a.m. UTC | #2
> 
> Should we rather just package the higher level .pyc into their own
> packages?
> 

But this would add 3 additional packages, one for each optimization level.
Embedded devices might have limited storage resources.
Therefore, it makes sense to include only the .pyc files with the necessary optimization level in the image, not all of them.

Patch with (hopefully) proper formatting:

From d7159cd842fd2a10ba5aeff62877be2ebb2eab87 Mon Sep 17 00:00:00 2001
From: Lukas Woodtli <lukas.woodtli@husqvarnagroup.com>
Date: Mon, 7 Apr 2025 11:55:33 +0200
Subject: [PATCH] python3: Allow to specify which pyc files to keep

The pyc files for a specific optimization level can be kept and are
installed in the final image.

Signed-off-by: Lukas Woodtli <lukas.woodtli@husqvarnagroup.com>
---
.../recipes-devtools/python/python3_3.13.2.bb | 32 +++++++++++++++----
1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/meta/recipes-devtools/python/python3_3.13.2.bb b/meta/recipes-devtools/python/python3_3.13.2.bb
index 7c36fd92ed..ac74432a2a 100644
--- a/meta/recipes-devtools/python/python3_3.13.2.bb
+++ b/meta/recipes-devtools/python/python3_3.13.2.bb
@@ -217,6 +217,12 @@ do_install:append:class-native() {
         mv ${D}/${bindir}/${PN}/python*config ${D}/${bindir}/
 }

+# We want bytecode precompiled .py files (.pyc's) by default
+# but the user may set it on their own conf
+INCLUDE_PYCS ?= "1"
+
+PYCS_OPT_LEVEL ?= "0"
+
 do_install:append() {
         for c in ${D}/${libdir}/python${PYTHON_MAJMIN}/_sysconfigdata*.py; do
             python3 ${UNPACKDIR}/reformat_sysconfig.py $c
@@ -249,10 +255,25 @@ do_install:append() {
         # so remove it too
         rm -f ${D}${libdir}/python${PYTHON_MAJMIN}/__pycache__/traceback.cpython*

-        # Remove the opt-1.pyc and opt-2.pyc files. They effectively waste space on embedded
-        # style targets as they're only used when python is called with the -O or -OO options
-        # which is rare.
-        find ${D} -name *opt-*.pyc -delete
+        if [ "${INCLUDE_PYCS}" -eq "1" ]; then
+             # Remove only the .pyc files of unused optimization level.
+            if [ "${PYCS_OPT_LEVEL}" -eq "0" ]; then
+                # keep only unoptimized .pyc files
+                find ${D} -name *.opt-1.pyc -exec rm -f {} \;
+                find ${D} -name *.opt-2.pyc -exec rm -f {} \;
+            elif [ "${PYCS_OPT_LEVEL}" -eq "1" ]; then
+                # keep only .pyc files with optimization level 1
+                find ${D} -name *.pyc -and -not -name *.opt-1.pyc -exec rm -f {} \;
+            elif [ "${PYCS_OPT_LEVEL}" -eq "2" ]; then
+                # keep only .pyc files with optimization level 2
+                find ${D} -name *.pyc -and -not -name *.opt-2.pyc -exec rm -f {} \;
+            else
+                bberror "Python optimization level ${PYCS_OPT_LEVEL} is not supported"
+            fi
+        else
+            # remove all .pyc files
+            find ${D} -name *.pyc -delete
+        fi
 }

 do_install:append:class-nativesdk () {
@@ -326,9 +347,6 @@ py_package_preprocess () {
         rm -rf ${PKGD}/${libdir}/python-sysconfigdata
 }

-# We want bytecode precompiled .py files (.pyc's) by default
-# but the user may set it on their own conf
-INCLUDE_PYCS ?= "1"

 python(){
     import collections, json
--
2.43.0
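
For readers who want to check which files each setting keeps, the selection done by the find calls above can be sketched in Python. This is only an illustration of the intended behaviour, not part of the patch; pyc_opt_level and files_to_delete are hypothetical helper names. It assumes the usual CPython cache naming, where level 1 and level 2 files carry an ".opt-1" or ".opt-2" tag and level 0 files carry none.

```python
import re

# CPython tags optimized bytecode as "<name>.<cache_tag>.opt-N.pyc";
# a .pyc name without such a tag is level 0.
_OPT_TAG = re.compile(r"\.opt-([12])\.pyc$")


def pyc_opt_level(filename):
    """Return the level (0, 1 or 2) encoded in a .pyc name, or None if not a .pyc."""
    if not filename.endswith(".pyc"):
        return None
    match = _OPT_TAG.search(filename)
    return int(match.group(1)) if match else 0


def files_to_delete(filenames, include_pycs, opt_level):
    """Hypothetical helper: the .pyc files the patched do_install would remove."""
    if not include_pycs:
        # INCLUDE_PYCS != 1: every .pyc goes, whatever its level
        return [n for n in filenames if pyc_opt_level(n) is not None]
    if opt_level not in (0, 1, 2):
        raise ValueError(f"Python optimization level {opt_level} is not supported")
    # Keep only the requested level; non-.pyc files are never touched
    return [n for n in filenames if pyc_opt_level(n) not in (None, opt_level)]


names = ["a.cpython-313.pyc", "a.cpython-313.opt-1.pyc",
         "a.cpython-313.opt-2.pyc", "a.py"]
for level in (0, 1, 2):
    print(level, files_to_delete(names, True, level))
```

One difference from the shell version: the -name patterns in the find calls are unquoted, so the shell may expand them before find sees them if matching files exist in the working directory. Quoting the patterns avoids that.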
Richard Purdie April 10, 2025, 11:58 a.m. UTC | #3
On Thu, 2025-04-10 at 04:36 -0700, Lukas Woodtli via lists.openembedded.org wrote:
>  
> > Should we rather just package the higher level .pyc into their own packages?
>  
>  
> But this would include 3 additional packages. One for each optimization level.
> Embedded devices might have limited storage resources.
> Therefore, it makes sense to just include the .pyc files with the necessary optimization level to the image.
> And not all of them.

The nice thing with packages is that you can then choose to install
them or not. We'd want to default to not for these.

The key question is whether this level of configuration is generally
useful enough to be worth us maintaining it in OE-Core. I don't know if
this is something you needed for a one off investigation or that people
need it in general.

Cheers,

Richard
Alexander Kanavin April 10, 2025, 12:11 p.m. UTC | #4
On Thu, 10 Apr 2025 at 13:58, Richard Purdie via
lists.openembedded.org
<richard.purdie=linuxfoundation.org@lists.openembedded.org> wrote:

> The key question is whether this level of configuration is generally
> useful enough to be worth us maintaining it in OE-Core. I don't know if
> this is something you needed for a one off investigation or that people
> need it in general.

I think it would be good to step back and look at what these
optimization levels actually do. Upstream clearly treats them as
optional opt-in via python interpreter command line switch or
environment variable, which means they don't think picking a higher
default level is something python users should be doing. So why would
we be going against this approach?

Alex
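
As background on the opt-in mechanism mentioned above: the interpreter looks for the cached file whose name matches the level it was started with (no flag, -O, or -OO, or the PYTHONOPTIMIZE environment variable). So a .opt-2.pyc file is only picked up when Python runs at level 2. A minimal sketch using importlib.util.cache_from_source shows the names involved; the source path is purely illustrative.

```python
import importlib.util
import sys

source = "/usr/lib/python3.13/os.py"  # illustrative path, not taken from the recipe

# optimization="" means level 0; "1" and "2" correspond to -O and -OO.
for level, flag in (("", "no flag"), ("1", "-O"), ("2", "-OO")):
    cached = importlib.util.cache_from_source(source, optimization=level)
    print(f"{flag:8} -> {cached}")

# Level the running interpreter was started with (0, 1 or 2).
print("current level:", sys.flags.optimize)
```

If an image ships only the level 2 files, they would presumably be used only when the target interpreter is also run at level 2.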
Lukas Woodtli April 10, 2025, 1:05 p.m. UTC | #5
> 
> 
> 
>> The key question is whether this level of configuration is generally
>> useful enough to be worth us maintaining it in OE-Core. I don't know if
>> this is something you needed for a one off investigation or that people
>> need it in general.
> 
> I think it would be good to step back and look at what these
> optimization levels actually do. Upstream clearly treats them as
> optional opt-in via python interpreter command line switch or
> environment variable, which means they don't think picking a higher
> default level is something python users should be doing. So why would
> we be going against this approach?
> 
> 

The reason to use these features of the Python interpreter is that we develop for
resource-restricted devices. The target device should not need to compile the Python
files to bytecode at runtime (which happens regardless of the optimization level).
We can do this work at build time. Moreover, embedded devices usually
have restricted storage size, so using optimization level 2 for pyc files is sensible.
Most embedded devices don't need the docstrings in the Python files (similar to
not needing the man pages), so we can get rid of them (that is what optimization
level 2 does, apart from disabling asserts).

Lukas
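
The effect of the levels described above can be checked with the optimize argument of the built-in compile(). The following is a small sketch rather than a benchmark; SOURCE and behaviour are names made up for the illustration.

```python
SOURCE = '''
def check(value):
    """Return value unchanged after a sanity check."""
    assert value >= 0, "negative value"
    return value
'''


def behaviour(optimize):
    """Compile SOURCE at the given level; report (asserts active, docstring kept)."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    check = namespace["check"]
    try:
        check(-1)
        asserts_active = False
    except AssertionError:
        asserts_active = True
    return asserts_active, check.__doc__ is not None


for level in (0, 1, 2):
    asserts_active, docstring_kept = behaviour(level)
    print(f"level {level}: asserts active={asserts_active}, docstring kept={docstring_kept}")
```

Level 1 drops the assert statement but keeps the docstring; level 2 drops both.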
Alexander Kanavin April 10, 2025, 2:31 p.m. UTC | #6
On Thu, 10 Apr 2025 at 15:05, Lukas Woodtli via lists.openembedded.org
<lw=hqv.ch@lists.openembedded.org> wrote:
> The reason to use these features of the Python interpreter is that we develop for
> resource restricted devices. The target device should not need to compile the Python
> files to byte-code at runtime (which happens regardless of the optimization level).
> We can do this work at build time. Moreover, Embedded devices usually
> have restricted storage size. So using the optimization level 2 for pyc files is sensible.
> Most embedded devices don't need the docstrings in the Python files (similar to
> not needing the man pages). So we can get rid of them (that is what optimization
> level 2 does, apart from disabling asserts).

Most embedded devices also need to be robust. Lest you forget, python
and its extensions are written in C, and asserts prevent what is
over-generously termed 'undefined behavior', with a loud crash. Which
is a feature some people find useful. Can you demonstrate speedups?

The docstrings argument also needs to be substantiated with numbers.
How much space is actually being saved?

Is there anything else to these optimizations? Is it documented
anywhere? How can we know if it changes over time, with new python
releases? I feel like this needs to be considered far more carefully
than you have so far.

Alex
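
One way to get numbers for the space question is to compile the same source tree at each level and compare the total size of the resulting .pyc files. The sketch below assumes a directory of .py files is available on the build host; pyc_sizes is a hypothetical helper and the path in the usage note is only an example. Files that fail to compile are skipped.

```python
import pathlib
import py_compile
import tempfile


def pyc_sizes(source_dir):
    """Return {level: total bytes of .pyc output} for all .py files under source_dir."""
    sources = sorted(pathlib.Path(source_dir).rglob("*.py"))
    totals = {}
    for level in (0, 1, 2):
        total = 0
        with tempfile.TemporaryDirectory() as out:
            for index, src in enumerate(sources):
                target = pathlib.Path(out) / f"{index}.pyc"
                try:
                    py_compile.compile(str(src), cfile=str(target),
                                       optimize=level, doraise=True)
                except py_compile.PyCompileError:
                    continue  # e.g. deliberately broken test fixtures
                total += target.stat().st_size
        totals[level] = total
    return totals
```

Calling pyc_sizes("/usr/lib/python3.13") on a host installation, for example, would give three totals to compare. The actual savings depend on how docstring-heavy the code is, so no figures are claimed here.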
Mike Looijmans April 11, 2025, 8:55 a.m. UTC | #7
On 10-04-2025 16:31, Alexander Kanavin via lists.openembedded.org wrote:
> On Thu, 10 Apr 2025 at 15:05, Lukas Woodtli via lists.openembedded.org
> <lw=hqv.ch@lists.openembedded.org> wrote:
>> The reason to use these features of the Python interpreter is that we develop for
>> resource restricted devices. The target device should not need to compile the Python
>> files to byte-code at runtime (which happens regardless of the optimization level).
>> We can do this work at build time. Moreover, Embedded devices usually
>> have restricted storage size. So using the optimization level 2 for pyc files is sensible.
>> Most embedded devices don't need the docstrings in the Python files (similar to
>> not needing the man pages). So we can get rid of them (that is what optimization
>> level 2 does, apart from disabling asserts).
> Most embedded devices also need to be robust. Lest you forget, python
> and its extensions are written in C, and asserts prevent what is
> over-generously termed 'undefined behavior', with a loud crash. Which
> is a feature some people find useful. Can you demonstrate speedups?
>
> Docstrings argument as well needs to be substantiated with numbers.
> How much space is actually being saved?
>
> Is there anything else to these optimizations? Is it documented
> anywhere? How can we know if it changes over time, with new python
> releases? I feel like this needs to be considered far more carefully
> than you have so far.

Getting a deja-vu feel here... We've used a similar approach in the past for 
OpenPLi, as we had to cram the full DVB settop box GUI firmware into 32MB NAND 
flash for some boxes.

The impact of the "optimization" on runtime performance was negligible, and 
we're talking about 200-400 MHz MIPS CPU systems in those days.

Python's optimization level 2 is sure to break things. Quite a large software 
base uses asserts and docstrings for their internal purposes. One might argue 
that that is "broken", but that's not something that can be fixed from this 
side...

We didn't experience any ill effects from optimization level 1, so we just 
patched the Python stuff to always use that, regardless of the file 
extension being pyc.

As for space savings, the best thing was to omit the "py" files. That results 
in more than halving the space required. To accomplish that, we split the 
packages and put all "py" source files into a "-src" package. So on the end 
product, only the pyc files got installed and developers could optionally 
install some -src packages if they wanted to hack on the box itself (the major 
advantage of using Python).

M.
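
The idea of installing only the pyc files can be illustrated with stock CPython: a module imports without its source if the .pyc sits where the .py would have been and uses the plain "name.pyc" form. This is only a sketch of that mechanism, not OpenPLi's actual implementation; build_sourceless and the module name are made up for the example.

```python
import importlib
import pathlib
import py_compile
import sys
import tempfile


def build_sourceless(source_text, module_name, directory, optimize=1):
    """Write module_name.pyc into directory and leave no .py file behind."""
    directory = pathlib.Path(directory)
    src = directory / f"{module_name}.py"
    src.write_text(source_text)
    # "name.pyc" next to where "name.py" was is the form the import system
    # accepts when the source file is absent.
    py_compile.compile(str(src), cfile=str(directory / f"{module_name}.pyc"),
                       optimize=optimize, doraise=True)
    src.unlink()


with tempfile.TemporaryDirectory() as image_dir:
    build_sourceless("def greet():\n    return 'hello'\n", "demo_sourceless", image_dir)
    sys.path.insert(0, image_dir)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("demo_sourceless")
        print(module.greet(), "from", module.__file__)
    finally:
        sys.path.remove(image_dir)
```

A file loaded this way is used as compiled, whatever level the interpreter itself was started with, so the level is fixed at build time.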
Patch

diff --git a/meta/recipes-devtools/python/python3_3.13.2.bb b/meta/recipes-devtools/python/python3_3.13.2.bb
index 7c36fd92ed..ac74432a2a 100644
--- a/meta/recipes-devtools/python/python3_3.13.2.bb
+++ b/meta/recipes-devtools/python/python3_3.13.2.bb
@@ -217,6 +217,12 @@  do_install:append:class-native() {
          mv ${D}/${bindir}/${PN}/python*config ${D}/${bindir}/
  }

+# We want bytecode precompiled .py files (.pyc's) by default
+# but the user may set it on their own conf
+INCLUDE_PYCS ?= "1"
+
+PYCS_OPT_LEVEL ?= "0"
+
  do_install:append() {
          for c in ${D}/${libdir}/python${PYTHON_MAJMIN}/_sysconfigdata*.py; do