
Re: [atomic-devel] [fedora-atomic f23] remove Python source files



On 1 December 2015 at 22:36, Giuseppe Scrivano <gscrivan redhat com> wrote:
> Hi,
>
> I was experimenting with reducing the size of the Atomic Host image and
> it seems that a lot of space is used by Python source files.

This sounds like an idea worth considering to me - we (wearing my
CPython hat) fully support sourceless deployments, but I hadn't
considered that this could be used to save space in container
images.

The main downside I see is that it's going to lose a lot of
information from Python tracebacks, and, at least as far as I am
aware, the tooling doesn't currently exist to stitch a sourceless
traceback back together with a suitable source tree to get a readable
traceback again.
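
To illustrate what gets lost, here is a minimal sketch (the module name
tb_demo and the temporary directory are made up for the demonstration).
It compiles a small module, deletes the source, and formats the
traceback from a failing call. The file name and line number should
still be reported, since they are stored in the bytecode, but the text
of the failing line cannot be shown:

```python
import compileall
import importlib
import os
import sys
import tempfile
import traceback

# Hypothetical throwaway module, written to a temporary directory.
root = tempfile.mkdtemp()
src = os.path.join(root, "tb_demo.py")
with open(src, "w") as f:
    f.write('def boom():\n    raise ValueError("broken")\n')

# Compile to tb_demo.pyc next to the source, then remove the source.
compileall.compile_dir(root, maxlevels=0, legacy=True, quiet=1)
os.remove(src)

sys.path.insert(0, root)
importlib.invalidate_caches()
import tb_demo  # loaded from tb_demo.pyc only

try:
    tb_demo.boom()
except ValueError:
    # The frame for boom() names tb_demo.py and its line number,
    # but there is no source line to print underneath it.
    report = traceback.format_exc()

print(report)
```

Mapping such a report back to readable source would mean looking up
each file name and line number in a matching source tree by hand.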

> I had to deal differently with the two versions as Python 3 handles
> source-less distributions in a different way than Python 2.
>
> Python 2 simply loads the *.pyc file when the *.py file is missing.
>
> Python 3 requires an additional step, as it puts the precompiled version
> under the __pycache__ directory, but it expects the file to be one level
> up when the source file is missing:
>
> /foo/__pycache__/test.cpython-34.pyo -> /foo/test.pyc

The Python 3 compileall module supports this directly, by passing the
"-b" option: https://docs.python.org/3/library/compileall.html#command-line-use

> This patch reduces the used disk space by around 55 MB.
>
> Any comments?

> diff --git a/treecompose-post.sh b/treecompose-post.sh
> index 73b6573..39f1ba0 100755
> --- a/treecompose-post.sh
> +++ b/treecompose-post.sh
> @@ -8,3 +8,40 @@ find /usr/share/locale -mindepth  1 -maxdepth 1 -type d -not -name "${KEEPLANG}"
>  localedef --list-archive | grep -a -v ^"${KEEPLANG}" | xargs localedef --delete-from-archive
>  mv -f /usr/lib/locale/locale-archive /usr/lib/locale/locale-archive.tmpl
>  build-locale-archive
> +
> +# Compile all the files.
> +find /usr/lib*/python2.* -type d -exec python2 -OO -m compileall -l {} +
> +find /usr/lib*/python3.* -type d -exec python3 -OO -m compileall -l {} +

If you pass "-b" in the second line, Python 2 & 3 should produce files
in the same places.
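
As a rough sketch of what "-b" does, the same behaviour is available
from the compileall API through the legacy=True argument. The module
name demo_mod and the temporary directory below are only placeholders
for the demonstration:

```python
import compileall
import importlib
import os
import sys
import tempfile


def build_sourceless_tree(root):
    """Write a tiny module, compile it in the legacy layout, drop the source."""
    src = os.path.join(root, "demo_mod.py")
    with open(src, "w") as f:
        f.write("VALUE = 42\n")
    # legacy=True is the API counterpart of the "-b" command-line option:
    # the bytecode is written beside the source as demo_mod.pyc rather
    # than under __pycache__/.
    ok = compileall.compile_dir(root, maxlevels=0, legacy=True, quiet=1)
    os.remove(src)
    return bool(ok), sorted(os.listdir(root))


tmp = tempfile.mkdtemp()
compiled_ok, files = build_sourceless_tree(tmp)

# The module is still importable with only the .pyc file present.
sys.path.insert(0, tmp)
importlib.invalidate_caches()
import demo_mod

result_value = demo_mod.VALUE
print(files, result_value)
```

With that layout there is no need for a separate renaming step on the
Python 3 side.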

However, -OO strips docstrings, which can break some applications -
this is why the Fedora Python packaging guidelines were recently
updated to advise against using -OO for system packages. (Changes to
the way optimisation levels are handled in Python 3.5+ mean that
restriction only applies to 3.4 and earlier, but even on 3.5+ it
would still be a problem here, since renaming the bytecode files makes
them load even in the default __debug__ mode.)
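
For reference, the 3.5+ change is that the optimisation level is
recorded in the cached file name instead of a separate .pyo extension.
A small sketch using importlib.util.cache_from_source shows the names
involved (the path /foo/test.py is just the example path from the
quoted message, and the exact "cpython-XY" tag depends on the
interpreter running it):

```python
import importlib.util


def cached_names(source_path):
    """Map each optimisation level to the bytecode path Python would use."""
    return {
        level: importlib.util.cache_from_source(source_path, optimization=level)
        for level in ("", 1, 2)  # "" means no optimisation tag
    }


names = cached_names("/foo/test.py")
for level, path in names.items():
    print(repr(level), path)
```

Since the level lives only in the file name, renaming an opt-2 file to
a plain test.pyc hides how it was compiled.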

> +# Here we treat Python 2 and Python 3 differently:
> +
> +# Python 2
> +# *.pyo files are basically *.pyc files, except that when the source
> +# file is missing Python will load only the .pyc file:
> +find /usr/lib*/python2.* -type f -name "*.pyo" | while read i
> +do
> +    destination_pyc=$(echo $i | sed -e's|pyo$|pyc|')
> +    rm -f $destination_pyc
> +    mv $i $destination_pyc
> +done

This renaming would lead to -OO modules being loaded into __debug__
and -O processes, potentially causing problems for code expecting
docstrings to be present. It should be OK if the files are compiled
with the "-O" optimisation level, though.
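
A quick way to see the difference between the levels, as a sketch using
the built-in compile() with an explicit optimize argument (the greet
function is only a made-up example):

```python
SOURCE = '''
def greet(name):
    """Return a greeting for name."""
    assert name, "name must not be empty"
    return "hello " + name
'''


def load_at(optimize):
    """Compile SOURCE at the given optimisation level and return greet()."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["greet"]


# Level 0 is the default, 1 corresponds to -O, 2 corresponds to -OO.
doc_by_level = {level: load_at(level).__doc__ for level in (0, 1, 2)}
print(doc_by_level)

# At level 1 the docstring survives but the assert is compiled out,
# so an empty name is accepted instead of raising AssertionError.
greeting_at_o1 = load_at(1)("")
```

Code that reads __doc__ at runtime keeps working at level 1 and gets
None at level 2.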

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan gmail com   |   Brisbane, Australia

