Message ID | 2c0885e1-8821-4470-8e54-dcad6a241834@hqv.ch |
---|---|
State | New |
Series | python3: Allow to specify which pyc files to keep |
Should we rather just package the higher level .pyc into their own packages?

Alex

On Thu, 10 Apr 2025 at 08:37, Lukas Woodtli via lists.openembedded.org <lw=hqv.ch@lists.openembedded.org> wrote:
>
> From d7159cd842fd2a10ba5aeff62877be2ebb2eab87 Mon Sep 17 00:00:00 2001
> From: Lukas Woodtli <lukas.woodtli@husqvarnagroup.com>
> Date: Mon, 7 Apr 2025 11:55:33 +0200
> Subject: [PATCH] python3: Allow to specify which pyc files to keep
>
> The pyc files for a specific optimization level can be kept and are
> installed in the final image.
>
> Signed-off-by: Lukas Woodtli <lukas.woodtli@husqvarnagroup.com>
> ---
> .../recipes-devtools/python/python3_3.13.2.bb | 32 +++++++++++++++----
> 1 file changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/meta/recipes-devtools/python/python3_3.13.2.bb
> b/meta/recipes-devtools/python/python3_3.13.2.bb
> index 7c36fd92ed..ac74432a2a 100644
> --- a/meta/recipes-devtools/python/python3_3.13.2.bb
> +++ b/meta/recipes-devtools/python/python3_3.13.2.bb
> @@ -217,6 +217,12 @@ do_install:append:class-native() {
> mv ${D}/${bindir}/${PN}/python*config ${D}/${bindir}/
> }
>
> +# We want bytecode precompiled .py files (.pyc's) by default
> +# but the user may set it on their own conf
> +INCLUDE_PYCS ?= "1"
> +
> +PYCS_OPT_LEVEL ?= "0"
> +
> do_install:append() {
> for c in
> ${D}/${libdir}/python${PYTHON_MAJMIN}/_sysconfigdata*.py; do
> python3 ${UNPACKDIR}/reformat_sysconfig.py $c
> @@ -249,10 +255,25 @@ do_install:append() {
> # so remove it too
> rm -f
> ${D}${libdir}/python${PYTHON_MAJMIN}/__pycache__/traceback.cpython*
>
> - # Remove the opt-1.pyc and opt-2.pyc files. They effectively
> waste space on embedded
> - # style targets as they're only used when python is called with
> the -O or -OO options
> - # which is rare.
> - find ${D} -name *opt-*.pyc -delete
> + if [ "${INCLUDE_PYCS}" -eq "1" ]; then
> + # Remove only the .pyc files of unused optimization level.
> + if [ "${PYCS_OPT_LEVEL}" -eq "0" ]; then
> + # keep only unoptimized .pyc files
> + find ${D} -name *.opt-1.pyc -exec rm -f {} \;
> + find ${D} -name *.opt-2.pyc -exec rm -f {} \;
> + elif [ "${PYCS_OPT_LEVEL}" -eq "1" ]; then
> + # keep only .pyc files with optimization level 1
> + find ${D} -name *.pyc -and -not -name *.opt-1.pyc -exec
> rm -f {} \;
> + elif [ "${PYCS_OPT_LEVEL}" -eq "2" ]; then
> + # keep only .pyc files with optimization level 2
> + find ${D} -name *.pyc -and -not -name *.opt-2.pyc -exec
> rm -f {} \;
> + else
> + bberror "Python optimization level ${PYCS_OPT_LEVEL} is
> not supported"
> + fi
> + else
> + # remove all .pyc files
> + find ${D} -name *.pyc -delete
> + fi
> }
>
> do_install:append:class-nativesdk () {
> @@ -326,9 +347,6 @@ py_package_preprocess () {
> rm -rf ${PKGD}/${libdir}/python-sysconfigdata
> }
>
> -# We want bytecode precompiled .py files (.pyc's) by default
> -# but the user may set it on their own conf
> -INCLUDE_PYCS ?= "1"
>
> python(){
> import collections, json
> --
> 2.43.0
> Should we rather just package the higher level .pyc into their own
> packages?

But this would add 3 additional packages, one for each optimization level.
Embedded devices might have limited storage resources.
Therefore, it makes sense to include only the .pyc files with the necessary optimization level in the image, and not all of them.

Patch with (hopefully) proper formatting:

From d7159cd842fd2a10ba5aeff62877be2ebb2eab87 Mon Sep 17 00:00:00 2001
From: Lukas Woodtli <lukas.woodtli@husqvarnagroup.com>
Date: Mon, 7 Apr 2025 11:55:33 +0200
Subject: [PATCH] python3: Allow to specify which pyc files to keep

The pyc files for a specific optimization level can be kept and are
installed in the final image.

Signed-off-by: Lukas Woodtli <lukas.woodtli@husqvarnagroup.com>
---
 .../recipes-devtools/python/python3_3.13.2.bb | 32 +++++++++++++++----
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/meta/recipes-devtools/python/python3_3.13.2.bb b/meta/recipes-devtools/python/python3_3.13.2.bb
index 7c36fd92ed..ac74432a2a 100644
--- a/meta/recipes-devtools/python/python3_3.13.2.bb
+++ b/meta/recipes-devtools/python/python3_3.13.2.bb
@@ -217,6 +217,12 @@ do_install:append:class-native() {
     mv ${D}/${bindir}/${PN}/python*config ${D}/${bindir}/
 }
 
+# We want bytecode precompiled .py files (.pyc's) by default
+# but the user may set it on their own conf
+INCLUDE_PYCS ?= "1"
+
+PYCS_OPT_LEVEL ?= "0"
+
 do_install:append() {
     for c in ${D}/${libdir}/python${PYTHON_MAJMIN}/_sysconfigdata*.py; do
         python3 ${UNPACKDIR}/reformat_sysconfig.py $c
@@ -249,10 +255,25 @@ do_install:append() {
     # so remove it too
     rm -f ${D}${libdir}/python${PYTHON_MAJMIN}/__pycache__/traceback.cpython*
 
-    # Remove the opt-1.pyc and opt-2.pyc files. They effectively waste space on embedded
-    # style targets as they're only used when python is called with the -O or -OO options
-    # which is rare.
-    find ${D} -name *opt-*.pyc -delete
+    if [ "${INCLUDE_PYCS}" -eq "1" ]; then
+        # Remove only the .pyc files of unused optimization level.
+        if [ "${PYCS_OPT_LEVEL}" -eq "0" ]; then
+            # keep only unoptimized .pyc files
+            find ${D} -name *.opt-1.pyc -exec rm -f {} \;
+            find ${D} -name *.opt-2.pyc -exec rm -f {} \;
+        elif [ "${PYCS_OPT_LEVEL}" -eq "1" ]; then
+            # keep only .pyc files with optimization level 1
+            find ${D} -name *.pyc -and -not -name *.opt-1.pyc -exec rm -f {} \;
+        elif [ "${PYCS_OPT_LEVEL}" -eq "2" ]; then
+            # keep only .pyc files with optimization level 2
+            find ${D} -name *.pyc -and -not -name *.opt-2.pyc -exec rm -f {} \;
+        else
+            bberror "Python optimization level ${PYCS_OPT_LEVEL} is not supported"
+        fi
+    else
+        # remove all .pyc files
+        find ${D} -name *.pyc -delete
+    fi
 }
 
 do_install:append:class-nativesdk () {
@@ -326,9 +347,6 @@ py_package_preprocess () {
     rm -rf ${PKGD}/${libdir}/python-sysconfigdata
 }
 
-# We want bytecode precompiled .py files (.pyc's) by default
-# but the user may set it on their own conf
-INCLUDE_PYCS ?= "1"
 
 python(){
     import collections, json
-- 
2.43.0
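To make the effect of the two variables easier to see, here is a rough Python model of which file names the patched `do_install:append` would delete. It is only an illustration of the `find` patterns in the patch, not part of the recipe: the function name `pycs_to_delete` and the sample file names are made up for this sketch, and the values are compared as strings, whereas the shell code uses the numeric `-eq` test.

```python
from fnmatch import fnmatchcase


def pycs_to_delete(names, include_pycs="1", opt_level="0"):
    """Model of the patch's find/rm logic (hypothetical helper, not recipe code)."""
    if include_pycs != "1":
        # Mirrors: find ${D} -name *.pyc -delete
        return [n for n in names if fnmatchcase(n, "*.pyc")]
    if opt_level == "0":
        # Mirrors the two finds for *.opt-1.pyc and *.opt-2.pyc
        return [
            n for n in names
            if fnmatchcase(n, "*.opt-1.pyc") or fnmatchcase(n, "*.opt-2.pyc")
        ]
    if opt_level in ("1", "2"):
        # Mirrors: find -name *.pyc -and -not -name *.opt-N.pyc
        keep = f"*.opt-{opt_level}.pyc"
        return [
            n for n in names
            if fnmatchcase(n, "*.pyc") and not fnmatchcase(n, keep)
        ]
    raise ValueError(f"Python optimization level {opt_level} is not supported")


# Sample names, assumed to follow CPython 3.13's cache naming.
SAMPLE = [
    "os.py",
    "os.cpython-313.pyc",
    "os.cpython-313.opt-1.pyc",
    "os.cpython-313.opt-2.pyc",
]

for level in ("0", "1", "2"):
    print(level, pycs_to_delete(SAMPLE, "1", level))
```

Note that with level 1 or 2 the unoptimized `.pyc` is removed as well, so exactly one bytecode variant per module remains in the image.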
On Thu, 2025-04-10 at 04:36 -0700, Lukas Woodtli via lists.openembedded.org wrote:
> > Should we rather just package the higher level .pyc into their own packages?
>
> But this would include 3 additional packages. One for each optimization level.
> Embedded devices might have limited storage resources.
> Therefore, it makes sense to just include the .pyc files with the necessary optimization level to the image.
> And not all of them.

The nice thing with packages is that you can then choose to install
them or not. We'd want to default to not for these.

The key question is whether this level of configuration is generally
useful enough to be worth us maintaining it in OE-Core. I don't know if
this is something you needed for a one off investigation or that people
need it in general.

Cheers,

Richard
On Thu, 10 Apr 2025 at 13:58, Richard Purdie via lists.openembedded.org <richard.purdie=linuxfoundation.org@lists.openembedded.org> wrote:
> The key question is whether this level of configuration is generally
> useful enough to be worth us maintaining it in OE-Core. I don't know if
> this is something you needed for a one off investigation or that people
> need it in general.

I think it would be good to step back and look at what these
optimization levels actually do. Upstream clearly treats them as
optional opt-in via python interpreter command line switch or
environment variable, which means they don't think picking a higher
default level is something python users should be doing. So why would
we be going against this approach?

Alex
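For reference, the opt-in mentioned here is the `-O` / `-OO` command line switch or the `PYTHONOPTIMIZE` environment variable, and each level reads its own cache file. The following sketch shows how CPython maps a level to a `.pyc` name; the helper `pyc_path` and the path `pkg/mod.py` are placeholders for this illustration, and the exact names printed depend on the interpreter version.

```python
import importlib.util
import sys


def pyc_path(source_path, level):
    """Return the cache path CPython would use for source_path at a level.

    Level 0 maps to the empty optimization tag, which yields the plain
    .pyc name; levels 1 and 2 yield the .opt-1 and .opt-2 variants.
    """
    tag = "" if level == 0 else level
    return importlib.util.cache_from_source(source_path, optimization=tag)


for level in (0, 1, 2):
    print(level, pyc_path("pkg/mod.py", level))

# The level the running interpreter was started with (0, 1 or 2).
print("current level:", sys.flags.optimize)
```

An interpreter started without `-O` only looks for the plain `.pyc`, so shipping only `.opt-2.pyc` files helps only if the target actually runs Python with `-OO` or the equivalent environment setting.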
>> The key question is whether this level of configuration is generally
>> useful enough to be worth us maintaining it in OE-Core. I don't know if
>> this is something you needed for a one off investigation or that people
>> need it in general.
>
> I think it would be good to step back and look at what these
> optimization levels actually do. Upstream clearly treats them as
> optional opt-in via python interpreter command line switch or
> environment variable, which means they don't think picking a higher
> default level is something python users should be doing. So why would
> we be going against this approach?

The reason to use these features of the Python interpreter is that we develop for
resource-restricted devices. The target device should not need to compile the Python
files to bytecode at runtime (which happens regardless of the optimization level).
We can do this work at build time. Moreover, embedded devices usually
have restricted storage size, so using optimization level 2 for pyc files is sensible.
Most embedded devices don't need the docstrings in the Python files (similar to
not needing the man pages). So we can get rid of them (that is what optimization
level 2 does, apart from disabling asserts).

Lukas
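The behaviour described here can be checked without restarting the interpreter, because the built-in `compile()` accepts an `optimize` argument. This is a small sketch under that assumption; `SOURCE`, `check` and `behaviour` are names invented for the example.

```python
SOURCE = '''
def check(x):
    """Return x after checking that it is positive."""
    assert x > 0, "x must be positive"
    return x
'''


def behaviour(level):
    """Compile SOURCE at the given level; report (assert active, docstring kept)."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=level), namespace)
    check = namespace["check"]
    try:
        check(-1)
        assert_active = False
    except AssertionError:
        assert_active = True
    return assert_active, check.__doc__ is not None


for level in (0, 1, 2):
    print(level, behaviour(level))
# 0 (True, True)
# 1 (False, True)
# 2 (False, False)
```

Level 1 drops `assert` statements (and code guarded by `__debug__`); level 2 additionally drops docstrings, which is what affects code that reads `__doc__` at runtime.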
On Thu, 10 Apr 2025 at 15:05, Lukas Woodtli via lists.openembedded.org <lw=hqv.ch@lists.openembedded.org> wrote:
> The reason to use these features of the Python interpreter is that we develop for
> resource restricted devices. The target device should not need to compile the Python
> files to byte-code at runtime (which happens regardless of the optimization level).
> We can do this work at build time. Moreover, Embedded devices usually
> have restricted storage size. So using the optimization level 2 for pyc files is sensible.
> Most embedded devices don't need the docstrings in the Python files (similar to
> not needing the man pages). So we can get rid of them (that is what optimization
> level 2 does, apart from disabling asserts).

Most embedded devices also need to be robust. Lest you forget, python
and its extensions are written in C, and asserts prevent what is
over-generously termed 'undefined behavior', with a loud crash. Which
is a feature some people find useful. Can you demonstrate speedups?

Docstrings argument as well needs to be substantiated with numbers.
How much space is actually being saved?

Is there anything else to these optimizations? Is it documented
anywhere? How can we know if it changes over time, with new python
releases? I feel like this needs to be considered far more carefully
than you have so far.

Alex
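One way to put numbers on the space question is to compile the same source at each level and compare the resulting file sizes. The sketch below does that with `py_compile`; the helper `pyc_sizes` is hypothetical, and the result for any real module has to be measured, so no figures are given here.

```python
import os
import py_compile
import tempfile


def pyc_sizes(source_path):
    """Return {level: size in bytes} of the .pyc built from source_path."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for level in (0, 1, 2):
            target = os.path.join(tmp, f"level{level}.pyc")
            py_compile.compile(
                source_path, cfile=target, optimize=level, doraise=True
            )
            sizes[level] = os.path.getsize(target)
    return sizes


def demo():
    """Measure a small throwaway module that has a docstring and an assert."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.py")
        with open(path, "w") as handle:
            handle.write(
                'def f(x):\n'
                '    """A docstring that only level 2 strips from the bytecode."""\n'
                '    assert x > 0\n'
                '    return x\n'
            )
        return pyc_sizes(path)


print(demo())
```

Running `pyc_sizes()` over every `.py` file of the installed standard library, and summing per level, would give the kind of figure asked for above.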
On 10-04-2025 16:31, Alexander Kanavin via lists.openembedded.org wrote:
> On Thu, 10 Apr 2025 at 15:05, Lukas Woodtli via lists.openembedded.org
> <lw=hqv.ch@lists.openembedded.org> wrote:
>> The reason to use these features of the Python interpreter is that we develop for
>> resource restricted devices. The target device should not need to compile the Python
>> files to byte-code at runtime (which happens regardless of the optimization level).
>> We can do this work at build time. Moreover, Embedded devices usually
>> have restricted storage size. So using the optimization level 2 for pyc files is sensible.
>> Most embedded devices don't need the docstrings in the Python files (similar to
>> not needing the man pages). So we can get rid of them (that is what optimization
>> level 2 does, apart from disabling asserts).
> Most embedded devices also need to be robust. Lest you forget, python
> and its extensions are written in C, and asserts prevent what is
> over-generously termed 'undefined behavior', with a loud crash. Which
> is a feature some people find useful. Can you demonstrate speedups?
>
> Docstrings argument as well needs to be substantiated with numbers.
> How much space is actually being saved?
>
> Is there anything else to these optimizations? Is it documented
> anywhere? How can we know if it changes over time, with new python
> releases? I feel like this needs to be considered far more carefully
> than you have so far.

Getting a deja-vu feel here...

We've used a similar approach in the past for OpenPLi, as we had to cram
the full DVB settop box GUI firmware into 32MB NAND flash for some boxes.

The impact of the "optimization" on runtime performance was negligible,
and we're talking about 200-400 MHz MIPS CPU systems in those days.

Python's optimization level 2 is sure to break things. Quite a large
software base uses asserts and docstrings for their internal purposes.
One might argue that that is "broken", but that's not something that can
be fixed from this side...

We didn't experience any ill effects from optimization level 1, so we
just patched the Python stuff to always use that, regardless of the
file extension being pyc.

As for space savings, the best thing was to omit the "py" files. That
results in more than halving the space required. To accomplish that, we
split the packages and put all "py" source files into a "-src" package.
So on the end product, only the pyc files got installed and developers
could optionally install some -src packages if they wanted to hack on
the box itself (the major advantage of using Python).

M.
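Shipping only bytecode works in stock CPython as long as the `.pyc` sits where the `.py` would have been (the "legacy" location) rather than in `__pycache__`. The following is a minimal sketch of that mechanism, not a description of the OpenPLi patches; the helper `import_without_source` and the module name used below are invented for the example.

```python
import importlib
import os
import py_compile
import sys
import tempfile


def import_without_source(module_name, source_text, optimize=0):
    """Compile source_text, delete the .py, and import from the .pyc alone."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, module_name + ".py")
        with open(source, "w") as handle:
            handle.write(source_text)
        # Legacy location: <name>.pyc next to where the source would live.
        py_compile.compile(
            source,
            cfile=os.path.join(tmp, module_name + ".pyc"),
            optimize=optimize,
            doraise=True,
        )
        os.remove(source)  # only the bytecode file is left
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return importlib.import_module(module_name)
        finally:
            sys.path.remove(tmp)


module = import_without_source("demo_sourceless_mod", "VALUE = 42\n")
print(module.VALUE)
```

The legacy file name carries no `opt-N` tag, so in this layout the optimization level is fixed by whatever `optimize` value was used at build time.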