diff mbox series

[yocto-docs,v2] Add scripts to build the docs in containers

Message ID 20241205-docs-build-dockerfiles-v2-1-047cb3245adf@bootlin.com
State Under Review
Headers show
Series [yocto-docs,v2] Add scripts to build the docs in containers | expand

Commit Message

Antonin Godard Dec. 5, 2024, 11:06 a.m. UTC
Add two scripts for building a container image and building the
documentation in that image: build-docs-container and
container-build-docs.

For now, the documentation/tools/dockerfiles directory contains a
Dockerfile for building with Ubuntu or Debian, but we could extend this
to the supported distros.

It should be possible to build the full documentation with two commands:

  ./documentation/tools/build-docs-container ubuntu:24.04

CONTAINERCMD can be replaced by "podman" to build with podman.

Signed-off-by: Antonin Godard <antonin.godard@bootlin.com>
---
Changes in v2:
- Merge the scripts to build a container and run the container into a
  single script.
- Parametrize the FROM dockerfile instruction from the script, allowing
  to use a single dockerfile for multiple distros.
- Add a paragraph on this in the README.
- Copy the dependency list in a file in the container, instead of
  passing them in a variable (to be future-proof on command line
  lengths).
- Fix the locale-gen command in the Ubuntu/Debian dockerfile.
- Link to v1: https://lore.kernel.org/r/20241121-docs-build-dockerfiles-v1-1-3b54e1237bf5@bootlin.com
---
 documentation/README                               |  27 +++++
 documentation/tools/build-docs-container           | 119 +++++++++++++++++++++
 .../tools/dockerfiles/Dockerfile.ubuntu-debian     |  18 ++++
 3 files changed, 164 insertions(+)


---
base-commit: 30002019198a168e48537407bb928facb26af82a
change-id: 20241120-docs-build-dockerfiles-07fc9f72cab8
prerequisite-change-id: 20241120-update-doc-deps-1d59abdb2119:v2
prerequisite-patch-id: d6ea594d309248553799e130bf46c297557c9009
prerequisite-patch-id: 84cdb06cfb747a834b6520b09991e5f787862991
prerequisite-patch-id: 5ad97d55512df0eaa9928eb852e0cb8c9dd586ae
prerequisite-patch-id: 39b330354e63d826e14af8168dfc4e3513685052
prerequisite-patch-id: b8811c7a7cc5b6886a037fb535216135064db84e
prerequisite-patch-id: 861769a88e59749479b7e4c34428805ba20651a2

Best regards,

Comments

Quentin Schulz Dec. 6, 2024, 2:23 p.m. UTC | #1
Hi Antonin,

On 12/5/24 12:06 PM, Antonin Godard wrote:
> Add two scripts for building a container image and building the
> documentation in that image: build-docs-container and
> container-build-docs.
> 

Yet there's only one left now :) (also in your commit title).

> For now, the documentation/tools/dockerfiles directory contains a
> Dockerfile for building with Ubuntu or Debian, but we could extend this
> to the supported distros.
> 
> It should be possible to build the full documentation with two commands:
> 
>    ./documentation/tools/build-docs-container ubuntu:24.04
> 
> CONTAINERCMD can be replaced by "podman" to build with podman.
> 

I assume installing podman-docker package could help with that too. Not 
sure it's recommended, but I do have it installed :)

> Signed-off-by: Antonin Godard <antonin.godard@bootlin.com>
> ---

It'd be nice to explicit the dependency on your other patch series here. 
You're the maintainer so you'll get the order right when merging but 
it's a good habit to take as a contributor so maintainers are aware of 
specific inter-series dependencies.

> Changes in v2:
> - Merge the scripts to build a container and run the container into a
>    single script.
> - Parametrize the FROM dockerfile instruction from the script, allowing
>    to use a single dockerfile for multiple distros.
> - Add a paragraph on this in the README.
> - Copy the dependency list in a file in the container, instead of
>    passing them in a variable (to be future-proof on command line
>    lengths).
> - Fix the locale-gen command in the Ubuntu/Debian dockerfile.
> - Link to v1: https://lore.kernel.org/r/20241121-docs-build-dockerfiles-v1-1-3b54e1237bf5@bootlin.com
> ---
>   documentation/README                               |  27 +++++
>   documentation/tools/build-docs-container           | 119 +++++++++++++++++++++
>   .../tools/dockerfiles/Dockerfile.ubuntu-debian     |  18 ++++
>   3 files changed, 164 insertions(+)
> 
> diff --git a/documentation/README b/documentation/README
> index 8a47fd4a3fd07d41d61a7d681d82bd13ac74527d..aebece13758005a8b913b4570fee6118256f6f9b 100644
> --- a/documentation/README
> +++ b/documentation/README
> @@ -128,6 +128,33 @@ dependencies in a virtual environment:
>    $ pipenv install
>    $ pipenv run make html
>   
> +Building the documentation in a container
> +=========================================
> +
> +The documentation can be built in a container with the build-docs-container
> +scripts in the tools/ directory. The default container command used by the
> +script is docker, but the script also supports podman by setting the
> +CONTAINERCMD variable in your environment:
> +
> +  $ export CONTAINERCMD=podman
> +
> +A basic usage of the script would be:
> +
> +  $ ./tools/build-docs-container debian:12
> +
> +This will the entire documentation in an Debian 12 container.
> +

+build?

s/an/a/

> +The documentation can be built in other distributions. This list can be obtained
> +by running:
> +
> +  $ ./tools/build-docs-container list
> +
> +The make target can optionally be provided to build a single documentation type
> +(these targets are listed in the documentation Makefile). For example, to build

s;the documentation Makefile;documentation/Makefike; ?

> +the documentation in HTML and EPub formats, the following command can be used:
> +
> +  $ ./tools/build-docs-container ubuntu:24.04 epub html
> +
>   Style checking the Yocto Project documentation
>   ==============================================
>   
> diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container
> new file mode 100755
> index 0000000000000000000000000000000000000000..b9578d6da49b707c4de215866b6396cfeff3d08b
> --- /dev/null
> +++ b/documentation/tools/build-docs-container
> @@ -0,0 +1,119 @@
> +#!/usr/bin/env bash
> +#
> +# Build a container ready to build the documentation be reading the dependencies
> +# listed in poky.yaml.in, and start a documentation build in this container.
> +#
> +# Usage:
> +#
> +#   ./documentation/tools/build-docs-container <image> [<make target>]
> +#
> +# Where <image> is one of the keys in YQ_KEYS. E.g.:
> +#
> +#   ./documentation/tools/build-docs-container ubuntu:24.04 html
> +#
> +# Will build the docs in an Ubuntu 24.04 container in html.
> +#
> +# The container commands can be used by exporting CONTAINERCMD in the
> +# environment. The default is docker, but podman can also be used.
> +

I would move "set" commands here.

> +SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
> +CONTAINERCMD=${CONTAINERCMD:-docker}
> +DOCS_DIR="$SCRIPT_DIR/../.."
> +POKY_YAML_IN="$SCRIPT_DIR/../poky.yaml.in"
> +
> +# This lists the different images we can build and the keys we pass to yq to
> +# find packages in poky.yaml.in.
> +# The keys should be in the form of "<image>:<version>", as this will be passed
> +# to FROM in the selected Dockerfile below.
> +# These are common yq keys used for multiple distros.
> +_UBUNTU_DEBIAN_YQ_KEYS=".UBUNTU_DEBIAN_HOST_PACKAGES_DOC .UBUNTU_DEBIAN_HOST_PACKAGES_DOC_PDF"
> +declare -A YQ_KEYS=(
> +  [ubuntu:22.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
> +  [ubuntu:24.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
> +  [debian:12]="$_UBUNTU_DEBIAN_YQ_KEYS"
> +)
> +
> +# This lists the dockerfile to use for each of the distro listed in YQ_KEYS
> +# above. There should be a 1 to 1 match between the keys listed in YQ_KEYS above
> +# and the keys listed in DOCKERFILES below.
> +declare -A DOCKERFILES=(
> +  [ubuntu:22.04]="Dockerfile.ubuntu-debian"
> +  [ubuntu:24.04]="Dockerfile.ubuntu-debian"
> +  [debian:12]="Dockerfile.ubuntu-debian"
> +)
> +
> +main ()
> +{
> +  if [ "$#" -lt 1 ]; then
> +    echo "No image provided. Provide one of: ${!YQ_KEYS[*]}"
> +    exit 1
> +  elif [ "$1" = "list" ]; then
> +    echo -e "Available container images:\n\n${!YQ_KEYS[*]}"
> +    exit 0
> +  fi
> +
> +  local image="$1"
> +  shift
> +  local make_targets="${*:-publish}"
> +
> +  for cmd in $CONTAINERCMD yq; do
> +    if ! which "$cmd" >/dev/null 2>&1; then
> +      echo "The $cmd command was not found. Make sure you have $cmd installed."
> +      exit 1
> +    fi
> +  done
> +
> +  # Get the appropriate dockerfile from DOCKERFILES
> +  dockerfile="${DOCKERFILES[$image]}"
> +
> +  local temporary_dep_file="$SCRIPT_DIR/dockerfiles/deps"
> +  echo -n > "$temporary_dep_file" # empty the file
> +  for dep_key in ${YQ_KEYS[$image]}; do
> +    yq --raw-output --join-output "$dep_key" "$POKY_YAML_IN" >> "$temporary_dep_file"
> +    # add a white space after last element of yq command
> +    echo -n " " >> "$temporary_dep_file"
> +  done
> +

Just use a temporary file, this would allow to run builds from two 
different distros at the same time. mktemp should help you with that.

> +  # docker build doesn't accept 2 colons, so "sanitize" the name
> +  local sanitized_dockername
> +  sanitized_dockername=$(echo "$image" | tr ':.' '-')
> +
> +  $CONTAINERCMD build \
> +    --tag yocto-docs-$sanitized_dockername:latest \
> +    --build-arg ARG_FROM="$image" \
> +    --file "$SCRIPT_DIR/dockerfiles/$dockerfile" \
> +    "$SCRIPT_DIR/dockerfiles"
> +
> +  # We can remove the deps file, we no longer need it
> +  rm -f "$temporary_dep_file"
> +
> +  local -a args_run=(
> +    --rm
> +    --interactive
> +    --tty
> +    --volume="$DOCS_DIR:/docs:rw"
> +    --workdir=/docs
> +    --security-opt label=disable
> +  )
> +
> +  if [ "$CONTAINERCMD" = "docker" ]; then
> +    args_run+=(
> +      --user="$(id -u)":"$(id -g)"
> +    )
> +  elif [ "$CONTAINERCMD" = "podman" ]; then
> +    # we need net access to fetch bitbake terms
> +    args_run+=(
> +      --cap-add=NET_RAW
> +      --userns=keep-id
> +    )
> +  fi
> +
> +  $CONTAINERCMD run \
> +    "${args_run[@]}" \
> +    yocto-docs-$sanitized_dockername \
> +    make -C documentation $make_targets
> +}
> +
> +set -eu -o pipefail
> +
> +main "$@"
> diff --git a/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
> new file mode 100644
> index 0000000000000000000000000000000000000000..3c543dc4b0e96dc9c00b201558e2ed00847339fa
> --- /dev/null
> +++ b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
> @@ -0,0 +1,18 @@
> +ARG ARG_FROM=debian:12 # default value to avoid warning
> +FROM $ARG_FROM
> +
> +ENV DEBIAN_FRONTEND=noninteractive
> +
> +# relative to the location of the dockerfile
> +COPY deps /deps
> +
> +RUN apt-get update \
> + && apt-get --yes --no-install-recommends install $(cat /deps) \
> + && apt-get --yes autoremove \
> + && apt-get clean \
> + && rm /deps
> +
> +RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && locale-gen
> +
> +RUN git config --global --add safe.directory /docs
> +
> 

So... I've had another idea.

Parsing yaml in shell, not fun.

But I don't think we **need** the instructions to be in the YAML file.

So I think we could drastically simplify all this by storing the setup 
instructions in shell scripts which are included in sphinx AND use those 
shell scripts in the Containerfile. We actually do this for our product 
user manuals :)

Essentially, we would have:

$ cat documentation/host-packages-ubuntu.sh
sudo apt install gawk wget git diffstat unzip texinfo gcc \ [...]

In Sphinx:

.. literalinclude:: 50.literalinclude.apt.sh
    :language: bash

In the appropriate place.

How we generate the Containerfile internally:
"""
cat <<EOF > $WORKDIR/merged.sh
#!/bin/bash
set -eux
export DEBIAN_FRONTEND=noninteractive
apt-get update
apt-get -y install sudo
EOF

# Merge bash snippets we want to run (in the right order)
cat ../source/50.literalinclude.{apt,atf,uboot,linux,debos-prepare,\ 
debos-build-bookworm}.sh >> $WORKDIR/merged.sh

chmod +x $WORKDIR/merged.sh

podman run --rm --tmpfs=/tmp -v=$WORKDIR:$WORKDIR --security-opt 
label=disable \  --annotation -w $WORKDIR debian:bookworm ./merged.sh
"""

I assume we could simply use COPY in the Containerfile and include the 
shell script in there and run the shell script in a single RUN. You 
could then do the cleanup step in a separate layer, not optimal since 
the layer will be unnecessarily big but I assume the end image is going 
to be smaller anyway?

The benefit is that you test **exactly** what's in the documentation. 
And also you don't have to parse YAML. And it should be relatively easy 
to add support for new distros.

What do you think?

Cheers,
Quentin
Antonin Godard Dec. 10, 2024, 10:33 a.m. UTC | #2
Hi Quentin,

On Fri Dec 6, 2024 at 3:23 PM CET, Quentin Schulz wrote:
> Hi Antonin,
>
> On 12/5/24 12:06 PM, Antonin Godard wrote:
>> Add two scripts for building a container image and building the
>> documentation in that image: build-docs-container and
>> container-build-docs.
>> 
>
> Yet there's only one left now :) (also in your commit title).

Oops, will update in v3 :)

>> For now, the documentation/tools/dockerfiles directory contains a
>> Dockerfile for building with Ubuntu or Debian, but we could extend this
>> to the supported distros.
>> 
>> It should be possible to build the full documentation with two commands:
>> 
>>    ./documentation/tools/build-docs-container ubuntu:24.04
>> 
>> CONTAINERCMD can be replaced by "podman" to build with podman.
>> 
>
> I assume installing podman-docker package could help with that too. Not 
> sure it's recommended, but I do have it installed :)

Didn't know about that one. But it shouldn't be required to use podman or docker
independently I think?

>> Signed-off-by: Antonin Godard <antonin.godard@bootlin.com>
>> ---
>
> It'd be nice to explicit the dependency on your other patch series here. 
> You're the maintainer so you'll get the order right when merging but 
> it's a good habit to take as a contributor so maintainers are aware of 
> specific inter-series dependencies.

You're right, I'll add this (and thanks for the tip, I overlooked that).

>> Changes in v2:
>> - Merge the scripts to build a container and run the container into a
>>    single script.
>> - Parametrize the FROM dockerfile instruction from the script, allowing
>>    to use a single dockerfile for multiple distros.
>> - Add a paragraph on this in the README.
>> - Copy the dependency list in a file in the container, instead of
>>    passing them in a variable (to be future-proof on command line
>>    lengths).
>> - Fix the locale-gen command in the Ubuntu/Debian dockerfile.
>> - Link to v1: https://lore.kernel.org/r/20241121-docs-build-dockerfiles-v1-1-3b54e1237bf5@bootlin.com
>> ---
>>   documentation/README                               |  27 +++++
>>   documentation/tools/build-docs-container           | 119 +++++++++++++++++++++
>>   .../tools/dockerfiles/Dockerfile.ubuntu-debian     |  18 ++++
>>   3 files changed, 164 insertions(+)
>> 
>> diff --git a/documentation/README b/documentation/README
>> index 8a47fd4a3fd07d41d61a7d681d82bd13ac74527d..aebece13758005a8b913b4570fee6118256f6f9b 100644
>> --- a/documentation/README
>> +++ b/documentation/README
>> @@ -128,6 +128,33 @@ dependencies in a virtual environment:
>>    $ pipenv install
>>    $ pipenv run make html
>>   
>> +Building the documentation in a container
>> +=========================================
>> +
>> +The documentation can be built in a container with the build-docs-container
>> +scripts in the tools/ directory. The default container command used by the
>> +script is docker, but the script also supports podman by setting the
>> +CONTAINERCMD variable in your environment:
>> +
>> +  $ export CONTAINERCMD=podman
>> +
>> +A basic usage of the script would be:
>> +
>> +  $ ./tools/build-docs-container debian:12
>> +
>> +This will the entire documentation in an Debian 12 container.
>> +
>
> +build?
>
> s/an/a/
>
>> +The documentation can be built in other distributions. This list can be obtained
>> +by running:
>> +
>> +  $ ./tools/build-docs-container list
>> +
>> +The make target can optionally be provided to build a single documentation type
>> +(these targets are listed in the documentation Makefile). For example, to build
>
> s;the documentation Makefile;documentation/Makefike; ?
>
>> +the documentation in HTML and EPub formats, the following command can be used:
>> +
>> +  $ ./tools/build-docs-container ubuntu:24.04 epub html
>> +
>>   Style checking the Yocto Project documentation
>>   ==============================================
>>   
>> diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container
>> new file mode 100755
>> index 0000000000000000000000000000000000000000..b9578d6da49b707c4de215866b6396cfeff3d08b
>> --- /dev/null
>> +++ b/documentation/tools/build-docs-container
>> @@ -0,0 +1,119 @@
>> +#!/usr/bin/env bash
>> +#
>> +# Build a container ready to build the documentation be reading the dependencies
>> +# listed in poky.yaml.in, and start a documentation build in this container.
>> +#
>> +# Usage:
>> +#
>> +#   ./documentation/tools/build-docs-container <image> [<make target>]
>> +#
>> +# Where <image> is one of the keys in YQ_KEYS. E.g.:
>> +#
>> +#   ./documentation/tools/build-docs-container ubuntu:24.04 html
>> +#
>> +# Will build the docs in an Ubuntu 24.04 container in html.
>> +#
>> +# The container commands can be used by exporting CONTAINERCMD in the
>> +# environment. The default is docker, but podman can also be used.
>> +
>
> I would move "set" commands here.

Agree with this comment and the ones above

>> +SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
>> +CONTAINERCMD=${CONTAINERCMD:-docker}
>> +DOCS_DIR="$SCRIPT_DIR/../.."
>> +POKY_YAML_IN="$SCRIPT_DIR/../poky.yaml.in"
>> +
>> +# This lists the different images we can build and the keys we pass to yq to
>> +# find packages in poky.yaml.in.
>> +# The keys should be in the form of "<image>:<version>", as this will be passed
>> +# to FROM in the selected Dockerfile below.
>> +# These are common yq keys used for multiple distros.
>> +_UBUNTU_DEBIAN_YQ_KEYS=".UBUNTU_DEBIAN_HOST_PACKAGES_DOC .UBUNTU_DEBIAN_HOST_PACKAGES_DOC_PDF"
>> +declare -A YQ_KEYS=(
>> +  [ubuntu:22.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>> +  [ubuntu:24.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>> +  [debian:12]="$_UBUNTU_DEBIAN_YQ_KEYS"
>> +)
>> +
>> +# This lists the dockerfile to use for each of the distro listed in YQ_KEYS
>> +# above. There should be a 1 to 1 match between the keys listed in YQ_KEYS above
>> +# and the keys listed in DOCKERFILES below.
>> +declare -A DOCKERFILES=(
>> +  [ubuntu:22.04]="Dockerfile.ubuntu-debian"
>> +  [ubuntu:24.04]="Dockerfile.ubuntu-debian"
>> +  [debian:12]="Dockerfile.ubuntu-debian"
>> +)
>> +
>> +main ()
>> +{
>> +  if [ "$#" -lt 1 ]; then
>> +    echo "No image provided. Provide one of: ${!YQ_KEYS[*]}"
>> +    exit 1
>> +  elif [ "$1" = "list" ]; then
>> +    echo -e "Available container images:\n\n${!YQ_KEYS[*]}"
>> +    exit 0
>> +  fi
>> +
>> +  local image="$1"
>> +  shift
>> +  local make_targets="${*:-publish}"
>> +
>> +  for cmd in $CONTAINERCMD yq; do
>> +    if ! which "$cmd" >/dev/null 2>&1; then
>> +      echo "The $cmd command was not found. Make sure you have $cmd installed."
>> +      exit 1
>> +    fi
>> +  done
>> +
>> +  # Get the appropriate dockerfile from DOCKERFILES
>> +  dockerfile="${DOCKERFILES[$image]}"
>> +
>> +  local temporary_dep_file="$SCRIPT_DIR/dockerfiles/deps"
>> +  echo -n > "$temporary_dep_file" # empty the file
>> +  for dep_key in ${YQ_KEYS[$image]}; do
>> +    yq --raw-output --join-output "$dep_key" "$POKY_YAML_IN" >> "$temporary_dep_file"
>> +    # add a white space after last element of yq command
>> +    echo -n " " >> "$temporary_dep_file"
>> +  done
>> +
>
> Just use a temporary file, this would allow to run builds from two 
> different distros at the same time. mktemp should help you with that.

The problem is that the source in COPY must be part of the build context, so I
don't having a temp file in /tmp is possible. See
https://docs.docker.com/reference/dockerfile/#adding-files-from-the-build-context

I could use "mktemp --tmpdir=$SCRIPT_DIR/dockerfiles" but I don't really see the
added value.

>> +  # docker build doesn't accept 2 colons, so "sanitize" the name
>> +  local sanitized_dockername
>> +  sanitized_dockername=$(echo "$image" | tr ':.' '-')
>> +
>> +  $CONTAINERCMD build \
>> +    --tag yocto-docs-$sanitized_dockername:latest \
>> +    --build-arg ARG_FROM="$image" \
>> +    --file "$SCRIPT_DIR/dockerfiles/$dockerfile" \
>> +    "$SCRIPT_DIR/dockerfiles"
>> +
>> +  # We can remove the deps file, we no longer need it
>> +  rm -f "$temporary_dep_file"
>> +
>> +  local -a args_run=(
>> +    --rm
>> +    --interactive
>> +    --tty
>> +    --volume="$DOCS_DIR:/docs:rw"
>> +    --workdir=/docs
>> +    --security-opt label=disable
>> +  )
>> +
>> +  if [ "$CONTAINERCMD" = "docker" ]; then
>> +    args_run+=(
>> +      --user="$(id -u)":"$(id -g)"
>> +    )
>> +  elif [ "$CONTAINERCMD" = "podman" ]; then
>> +    # we need net access to fetch bitbake terms
>> +    args_run+=(
>> +      --cap-add=NET_RAW
>> +      --userns=keep-id
>> +    )
>> +  fi
>> +
>> +  $CONTAINERCMD run \
>> +    "${args_run[@]}" \
>> +    yocto-docs-$sanitized_dockername \
>> +    make -C documentation $make_targets
>> +}
>> +
>> +set -eu -o pipefail
>> +
>> +main "$@"
>> diff --git a/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>> new file mode 100644
>> index 0000000000000000000000000000000000000000..3c543dc4b0e96dc9c00b201558e2ed00847339fa
>> --- /dev/null
>> +++ b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>> @@ -0,0 +1,18 @@
>> +ARG ARG_FROM=debian:12 # default value to avoid warning
>> +FROM $ARG_FROM
>> +
>> +ENV DEBIAN_FRONTEND=noninteractive
>> +
>> +# relative to the location of the dockerfile
>> +COPY deps /deps
>> +
>> +RUN apt-get update \
>> + && apt-get --yes --no-install-recommends install $(cat /deps) \
>> + && apt-get --yes autoremove \
>> + && apt-get clean \
>> + && rm /deps
>> +
>> +RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && locale-gen
>> +
>> +RUN git config --global --add safe.directory /docs
>> +
>> 
>
> So... I've had another idea.
>
> Parsing yaml in shell, not fun.
>
> But I don't think we **need** the instructions to be in the YAML file.
>
> So I think we could drastically simplify all this by storing the setup 
> instructions in shell scripts which are included in sphinx AND use those 
> shell scripts in the Containerfile. We actually do this for our product 
> user manuals :)
>
> Essentially, we would have:
>
> $ cat documentation/host-packages-ubuntu.sh
> sudo apt install gawk wget git diffstat unzip texinfo gcc \ [...]
>
> In Sphinx:
>
> .. literalinclude:: 50.literalinclude.apt.sh
>     :language: bash
>
> In the appropriate place.
>
> How we generate the Containerfile internally:
> """
> cat <<EOF > $WORKDIR/merged.sh
> #!/bin/bash
> set -eux
> export DEBIAN_FRONTEND=noninteractive
> apt-get update
> apt-get -y install sudo
> EOF
>
> # Merge bash snippets we want to run (in the right order)
> cat ../source/50.literalinclude.{apt,atf,uboot,linux,debos-prepare,\ 
> debos-build-bookworm}.sh >> $WORKDIR/merged.sh
>
> chmod +x $WORKDIR/merged.sh
>
> podman run --rm --tmpfs=/tmp -v=$WORKDIR:$WORKDIR --security-opt 
> label=disable \  --annotation -w $WORKDIR debian:bookworm ./merged.sh
> """
>
> I assume we could simply use COPY in the Containerfile and include the 
> shell script in there and run the shell script in a single RUN. You 
> could then do the cleanup step in a separate layer, not optimal since 
> the layer will be unnecessarily big but I assume the end image is going 
> to be smaller anyway?
>
> The benefit is that you test **exactly** what's in the documentation. 
> And also you don't have to parse YAML. And it should be relatively easy 
> to add support for new distros.
>
> What do you think?

I like this idea! Only downside I see is to have to re-run "apt get install" for
each run command, which is not really ideal IMO. I prefer caching what's already
been downloaded so we can re-build the doc swiftly (I imagined this script to be
used for more than verifying that what we list in the dep list is enough to
build, and used by anyone to test their changes to the docs easily).

On the other hand, running exactly what we document is a good thing. So maybe a
mix between the two?

For now, I guess the current solution is satisfying: we get close to what's
documented in the documentation without having the 100% match.

If you have time to send some patches to implement that I'd welcome them, though
:)

Thanks,
Antonin

--
Antonin Godard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Quentin Schulz Dec. 10, 2024, 11:08 a.m. UTC | #3
Hi Antonin,

On 12/10/24 11:33 AM, Antonin Godard via lists.yoctoproject.org wrote:
> Hi Quentin,
> 
> On Fri Dec 6, 2024 at 3:23 PM CET, Quentin Schulz wrote:
>> Hi Antonin,
>>
>> On 12/5/24 12:06 PM, Antonin Godard wrote:
>>> Add two scripts for building a container image and building the
>>> documentation in that image: build-docs-container and
>>> container-build-docs.
>>>
>>
>> Yet there's only one left now :) (also in your commit title).
> 
> Oops, will update in v3 :)
> 
>>> For now, the documentation/tools/dockerfiles directory contains a
>>> Dockerfile for building with Ubuntu or Debian, but we could extend this
>>> to the supported distros.
>>>
>>> It should be possible to build the full documentation with two commands:
>>>
>>>     ./documentation/tools/build-docs-container ubuntu:24.04
>>>
>>> CONTAINERCMD can be replaced by "podman" to build with podman.
>>>
>>
>> I assume installing podman-docker package could help with that too. Not
>> sure it's recommended, but I do have it installed :)
> 
> Didn't know about that one. But it shouldn't be required to use podman or docker
> independently I think?
> 

It's just that when you run `which docker` you would get podman. But we 
shouldn't require the user to install podman-docker for the script to 
work. Just wanted to mention it.

[...]

>>> +SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
>>> +CONTAINERCMD=${CONTAINERCMD:-docker}
>>> +DOCS_DIR="$SCRIPT_DIR/../.."
>>> +POKY_YAML_IN="$SCRIPT_DIR/../poky.yaml.in"
>>> +
>>> +# This lists the different images we can build and the keys we pass to yq to
>>> +# find packages in poky.yaml.in.
>>> +# The keys should be in the form of "<image>:<version>", as this will be passed
>>> +# to FROM in the selected Dockerfile below.
>>> +# These are common yq keys used for multiple distros.
>>> +_UBUNTU_DEBIAN_YQ_KEYS=".UBUNTU_DEBIAN_HOST_PACKAGES_DOC .UBUNTU_DEBIAN_HOST_PACKAGES_DOC_PDF"
>>> +declare -A YQ_KEYS=(
>>> +  [ubuntu:22.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>> +  [ubuntu:24.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>> +  [debian:12]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>> +)
>>> +
>>> +# This lists the dockerfile to use for each of the distro listed in YQ_KEYS
>>> +# above. There should be a 1 to 1 match between the keys listed in YQ_KEYS above
>>> +# and the keys listed in DOCKERFILES below.
>>> +declare -A DOCKERFILES=(
>>> +  [ubuntu:22.04]="Dockerfile.ubuntu-debian"
>>> +  [ubuntu:24.04]="Dockerfile.ubuntu-debian"
>>> +  [debian:12]="Dockerfile.ubuntu-debian"
>>> +)
>>> +
>>> +main ()
>>> +{
>>> +  if [ "$#" -lt 1 ]; then
>>> +    echo "No image provided. Provide one of: ${!YQ_KEYS[*]}"
>>> +    exit 1
>>> +  elif [ "$1" = "list" ]; then
>>> +    echo -e "Available container images:\n\n${!YQ_KEYS[*]}"
>>> +    exit 0
>>> +  fi
>>> +
>>> +  local image="$1"
>>> +  shift
>>> +  local make_targets="${*:-publish}"
>>> +
>>> +  for cmd in $CONTAINERCMD yq; do
>>> +    if ! which "$cmd" >/dev/null 2>&1; then
>>> +      echo "The $cmd command was not found. Make sure you have $cmd installed."
>>> +      exit 1
>>> +    fi
>>> +  done
>>> +
>>> +  # Get the appropriate dockerfile from DOCKERFILES
>>> +  dockerfile="${DOCKERFILES[$image]}"
>>> +
>>> +  local temporary_dep_file="$SCRIPT_DIR/dockerfiles/deps"
>>> +  echo -n > "$temporary_dep_file" # empty the file
>>> +  for dep_key in ${YQ_KEYS[$image]}; do
>>> +    yq --raw-output --join-output "$dep_key" "$POKY_YAML_IN" >> "$temporary_dep_file"
>>> +    # add a white space after last element of yq command
>>> +    echo -n " " >> "$temporary_dep_file"
>>> +  done
>>> +
>>
>> Just use a temporary file, this would allow to run builds from two
>> different distros at the same time. mktemp should help you with that.
> 
> The problem is that the source in COPY must be part of the build context, so I
> don't having a temp file in /tmp is possible. See
> https://docs.docker.com/reference/dockerfile/#adding-files-from-the-build-context
> 
> I could use "mktemp --tmpdir=$SCRIPT_DIR/dockerfiles" but I don't really see the
> added value.
> 

"this would allow to run builds from two different distros at the same time"

Also, it's guaranteed that an old file wouldn't be used in case the 
script is somehow broken.

>>> +  # docker build doesn't accept 2 colons, so "sanitize" the name
>>> +  local sanitized_dockername
>>> +  sanitized_dockername=$(echo "$image" | tr ':.' '-')
>>> +
>>> +  $CONTAINERCMD build \
>>> +    --tag yocto-docs-$sanitized_dockername:latest \
>>> +    --build-arg ARG_FROM="$image" \
>>> +    --file "$SCRIPT_DIR/dockerfiles/$dockerfile" \
>>> +    "$SCRIPT_DIR/dockerfiles"
>>> +
>>> +  # We can remove the deps file, we no longer need it
>>> +  rm -f "$temporary_dep_file"
>>> +
>>> +  local -a args_run=(
>>> +    --rm
>>> +    --interactive
>>> +    --tty
>>> +    --volume="$DOCS_DIR:/docs:rw"
>>> +    --workdir=/docs
>>> +    --security-opt label=disable
>>> +  )
>>> +
>>> +  if [ "$CONTAINERCMD" = "docker" ]; then
>>> +    args_run+=(
>>> +      --user="$(id -u)":"$(id -g)"
>>> +    )
>>> +  elif [ "$CONTAINERCMD" = "podman" ]; then
>>> +    # we need net access to fetch bitbake terms
>>> +    args_run+=(
>>> +      --cap-add=NET_RAW
>>> +      --userns=keep-id
>>> +    )
>>> +  fi
>>> +
>>> +  $CONTAINERCMD run \
>>> +    "${args_run[@]}" \
>>> +    yocto-docs-$sanitized_dockername \
>>> +    make -C documentation $make_targets
>>> +}
>>> +
>>> +set -eu -o pipefail
>>> +
>>> +main "$@"
>>> diff --git a/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>> new file mode 100644
>>> index 0000000000000000000000000000000000000000..3c543dc4b0e96dc9c00b201558e2ed00847339fa
>>> --- /dev/null
>>> +++ b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>> @@ -0,0 +1,18 @@
>>> +ARG ARG_FROM=debian:12 # default value to avoid warning
>>> +FROM $ARG_FROM
>>> +
>>> +ENV DEBIAN_FRONTEND=noninteractive
>>> +
>>> +# relative to the location of the dockerfile
>>> +COPY deps /deps
>>> +
>>> +RUN apt-get update \
>>> + && apt-get --yes --no-install-recommends install $(cat /deps) \
>>> + && apt-get --yes autoremove \
>>> + && apt-get clean \
>>> + && rm /deps
>>> +
>>> +RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && locale-gen
>>> +
>>> +RUN git config --global --add safe.directory /docs
>>> +
>>>
>>
>> So... I've had another idea.
>>
>> Parsing yaml in shell, not fun.
>>
>> But I don't think we **need** the instructions to be in the YAML file.
>>
>> So I think we could drastically simplify all this by storing the setup
>> instructions in shell scripts which are included in sphinx AND use those
>> shell scripts in the Containerfile. We actually do this for our product
>> user manuals :)
>>
>> Essentially, we would have:
>>
>> $ cat documentation/host-packages-ubuntu.sh
>> sudo apt install gawk wget git diffstat unzip texinfo gcc \ [...]
>>
>> In Sphinx:
>>
>> .. literalinclude:: 50.literalinclude.apt.sh
>>      :language: bash
>>
>> In the appropriate place.
>>
>> How we generate the Containerfile internally:
>> """
>> cat <<EOF > $WORKDIR/merged.sh
>> #!/bin/bash
>> set -eux
>> export DEBIAN_FRONTEND=noninteractive
>> apt-get update
>> apt-get -y install sudo
>> EOF
>>
>> # Merge bash snippets we want to run (in the right order)
>> cat ../source/50.literalinclude.{apt,atf,uboot,linux,debos-prepare,\
>> debos-build-bookworm}.sh >> $WORKDIR/merged.sh
>>
>> chmod +x $WORKDIR/merged.sh
>>
>> podman run --rm --tmpfs=/tmp -v=$WORKDIR:$WORKDIR --security-opt
>> label=disable \  --annotation -w $WORKDIR debian:bookworm ./merged.sh
>> """
>>
>> I assume we could simply use COPY in the Containerfile and include the
>> shell script in there and run the shell script in a single RUN. You
>> could then do the cleanup step in a separate layer, not optimal since
>> the layer will be unnecessarily big but I assume the end image is going
>> to be smaller anyway?
>>
>> The benefit is that you test **exactly** what's in the documentation.
>> And also you don't have to parse YAML. And it should be relatively easy
>> to add support for new distros.
>>
>> What do you think?
> 
> I like this idea! Only downside I see is to have to re-run "apt get install" for
> each run command, which is not really ideal IMO. I prefer caching what's already

No? I have one RUN command which is "execute this shell script". 
Whatever is executed as part of this shell script is part of one layer. 
You could even have one RUN command which is apt-get update && 
my-shell-script && apt-get cache remove whatever to make this a smaller 
layer.

> been downloaded so we can re-build the doc swiftly (I imagined this script to be
> used for more than verifying that what we list in the dep list is enough to
> build, and used by anyone to test their changes to the docs easily).
> 

You can have as many scripts as you want? And have them do as many or 
little things as you want, I'm not sure to follow what you have in mind 
here?

Cheers,
Quentin
Antonin Godard Dec. 10, 2024, 12:46 p.m. UTC | #4
Hi Quentin,

On Tue Dec 10, 2024 at 12:08 PM CET, Quentin Schulz wrote:
> Hi Antonin,
>
> On 12/10/24 11:33 AM, Antonin Godard via lists.yoctoproject.org wrote:
>> Hi Quentin,
>> 
>> On Fri Dec 6, 2024 at 3:23 PM CET, Quentin Schulz wrote:
>>> Hi Antonin,
>>>
>>> On 12/5/24 12:06 PM, Antonin Godard wrote:
>>>> Add two scripts for building a container image and building the
>>>> documentation in that image: build-docs-container and
>>>> container-build-docs.
>>>>
>>>
>>> Yet there's only one left now :) (also in your commit title).
>> 
>> Oops, will update in v3 :)
>> 
>>>> For now, the documentation/tools/dockerfiles directory contains a
>>>> Dockerfile for building with Ubuntu or Debian, but we could extend this
>>>> to the supported distros.
>>>>
>>>> It should be possible to build the full documentation with two commands:
>>>>
>>>>     ./documentation/tools/build-docs-container ubuntu:24.04
>>>>
>>>> CONTAINERCMD can be replaced by "podman" to build with podman.
>>>>
>>>
>>> I assume installing podman-docker package could help with that too. Not
>>> sure it's recommended, but I do have it installed :)
>> 
>> Didn't know about that one. But it shouldn't be required to use podman or docker
>> independently I think?
>> 
>
> It's just that when you run `which docker` you would get podman. But we 
> shouldn't require the user to install podman-docker for the script to 
> work. Just wanted to mention it.
>
> [...]
>
>>>> +SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
>>>> +CONTAINERCMD=${CONTAINERCMD:-docker}
>>>> +DOCS_DIR="$SCRIPT_DIR/../.."
>>>> +POKY_YAML_IN="$SCRIPT_DIR/../poky.yaml.in"
>>>> +
>>>> +# This lists the different images we can build and the keys we pass to yq to
>>>> +# find packages in poky.yaml.in.
>>>> +# The keys should be in the form of "<image>:<version>", as this will be passed
>>>> +# to FROM in the selected Dockerfile below.
>>>> +# These are common yq keys used for multiple distros.
>>>> +_UBUNTU_DEBIAN_YQ_KEYS=".UBUNTU_DEBIAN_HOST_PACKAGES_DOC .UBUNTU_DEBIAN_HOST_PACKAGES_DOC_PDF"
>>>> +declare -A YQ_KEYS=(
>>>> +  [ubuntu:22.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>> +  [ubuntu:24.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>> +  [debian:12]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>> +)
>>>> +
>>>> +# This lists the dockerfile to use for each of the distro listed in YQ_KEYS
>>>> +# above. There should be a 1 to 1 match between the keys listed in YQ_KEYS above
>>>> +# and the keys listed in DOCKERFILES below.
>>>> +declare -A DOCKERFILES=(
>>>> +  [ubuntu:22.04]="Dockerfile.ubuntu-debian"
>>>> +  [ubuntu:24.04]="Dockerfile.ubuntu-debian"
>>>> +  [debian:12]="Dockerfile.ubuntu-debian"
>>>> +)
>>>> +
>>>> +main ()
>>>> +{
>>>> +  if [ "$#" -lt 1 ]; then
>>>> +    echo "No image provided. Provide one of: ${!YQ_KEYS[*]}"
>>>> +    exit 1
>>>> +  elif [ "$1" = "list" ]; then
>>>> +    echo -e "Available container images:\n\n${!YQ_KEYS[*]}"
>>>> +    exit 0
>>>> +  fi
>>>> +
>>>> +  local image="$1"
>>>> +  shift
>>>> +  local make_targets="${*:-publish}"
>>>> +
>>>> +  for cmd in $CONTAINERCMD yq; do
>>>> +    if ! which "$cmd" >/dev/null 2>&1; then
>>>> +      echo "The $cmd command was not found. Make sure you have $cmd installed."
>>>> +      exit 1
>>>> +    fi
>>>> +  done
>>>> +
>>>> +  # Get the appropriate dockerfile from DOCKERFILES
>>>> +  dockerfile="${DOCKERFILES[$image]}"
>>>> +
>>>> +  local temporary_dep_file="$SCRIPT_DIR/dockerfiles/deps"
>>>> +  echo -n > "$temporary_dep_file" # empty the file
>>>> +  for dep_key in ${YQ_KEYS[$image]}; do
>>>> +    yq --raw-output --join-output "$dep_key" "$POKY_YAML_IN" >> "$temporary_dep_file"
>>>> +    # add a white space after last element of yq command
>>>> +    echo -n " " >> "$temporary_dep_file"
>>>> +  done
>>>> +
>>>
>>> Just use a temporary file, this would allow to run builds from two
>>> different distros at the same time. mktemp should help you with that.
>> 
>> The problem is that the source in COPY must be part of the build context, so I
>> don't having a temp file in /tmp is possible. See
>> https://docs.docker.com/reference/dockerfile/#adding-files-from-the-build-context
>> 
>> I could use "mktemp --tmpdir=$SCRIPT_DIR/dockerfiles" but I don't really see the
>> added value.
>> 
>
> "this would allow to run builds from two different distros at the same time"
>
> Also, it's guaranteed that an old file wouldn't be used in case the 
> script is somehow broken.

I see what you mean. However, since mktemp create a random name it creates a
new layer each time, invalidating the following commands in the container file.

Instead I propose to move this above:

>>>> +  # docker build doesn't accept 2 colons, so "sanitize" the name
>>>> +  local sanitized_dockername
>>>> +  sanitized_dockername=$(echo "$image" | tr ':.' '-')

And do this:

  temporary_dep_file="$SCRIPT_DIR/dockerfiles/$sanitized_dockername.deps"

So we avoid the distro clash and we still use cached layers.

>>>> +
>>>> +  $CONTAINERCMD build \
>>>> +    --tag yocto-docs-$sanitized_dockername:latest \
>>>> +    --build-arg ARG_FROM="$image" \
>>>> +    --file "$SCRIPT_DIR/dockerfiles/$dockerfile" \
>>>> +    "$SCRIPT_DIR/dockerfiles"
>>>> +
>>>> +  # We can remove the deps file, we no longer need it
>>>> +  rm -f "$temporary_dep_file"
>>>> +
>>>> +  local -a args_run=(
>>>> +    --rm
>>>> +    --interactive
>>>> +    --tty
>>>> +    --volume="$DOCS_DIR:/docs:rw"
>>>> +    --workdir=/docs
>>>> +    --security-opt label=disable
>>>> +  )
>>>> +
>>>> +  if [ "$CONTAINERCMD" = "docker" ]; then
>>>> +    args_run+=(
>>>> +      --user="$(id -u)":"$(id -g)"
>>>> +    )
>>>> +  elif [ "$CONTAINERCMD" = "podman" ]; then
>>>> +    # we need net access to fetch bitbake terms
>>>> +    args_run+=(
>>>> +      --cap-add=NET_RAW
>>>> +      --userns=keep-id
>>>> +    )
>>>> +  fi
>>>> +
>>>> +  $CONTAINERCMD run \
>>>> +    "${args_run[@]}" \
>>>> +    yocto-docs-$sanitized_dockername \
>>>> +    make -C documentation $make_targets
>>>> +}
>>>> +
>>>> +set -eu -o pipefail
>>>> +
>>>> +main "$@"
>>>> diff --git a/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>>> new file mode 100644
>>>> index 0000000000000000000000000000000000000000..3c543dc4b0e96dc9c00b201558e2ed00847339fa
>>>> --- /dev/null
>>>> +++ b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>>> @@ -0,0 +1,18 @@
>>>> +ARG ARG_FROM=debian:12 # default value to avoid warning
>>>> +FROM $ARG_FROM
>>>> +
>>>> +ENV DEBIAN_FRONTEND=noninteractive
>>>> +
>>>> +# relative to the location of the dockerfile
>>>> +COPY deps /deps
>>>> +
>>>> +RUN apt-get update \
>>>> + && apt-get --yes --no-install-recommends install $(cat /deps) \
>>>> + && apt-get --yes autoremove \
>>>> + && apt-get clean \
>>>> + && rm /deps
>>>> +
>>>> +RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && locale-gen
>>>> +
>>>> +RUN git config --global --add safe.directory /docs
>>>> +
>>>>
>>>
>>> So... I've had another idea.
>>>
>>> Parsing yaml in shell, not fun.
>>>
>>> But I don't think we **need** the instructions to be in the YAML file.
>>>
>>> So I think we could drastically simplify all this by storing the setup
>>> instructions in shell scripts which are included in sphinx AND use those
>>> shell scripts in the Containerfile. We actually do this for our product
>>> user manuals :)
>>>
>>> Essentially, we would have:
>>>
>>> $ cat documentation/host-packages-ubuntu.sh
>>> sudo apt install gawk wget git diffstat unzip texinfo gcc \ [...]
>>>
>>> In Sphinx:
>>>
>>> .. literalinclude:: 50.literalinclude.apt.sh
>>>      :language: bash
>>>
>>> In the appropriate place.
>>>
>>> How we generate the Containerfile internally:
>>> """
>>> cat <<EOF > $WORKDIR/merged.sh
>>> #!/bin/bash
>>> set -eux
>>> export DEBIAN_FRONTEND=noninteractive
>>> apt-get update
>>> apt-get -y install sudo
>>> EOF
>>>
>>> # Merge bash snippets we want to run (in the right order)
>>> cat ../source/50.literalinclude.{apt,atf,uboot,linux,debos-prepare,\
>>> debos-build-bookworm}.sh >> $WORKDIR/merged.sh
>>>
>>> chmod +x $WORKDIR/merged.sh
>>>
>>> podman run --rm --tmpfs=/tmp -v=$WORKDIR:$WORKDIR --security-opt
>>> label=disable \  --annotation -w $WORKDIR debian:bookworm ./merged.sh
>>> """
>>>
>>> I assume we could simply use COPY in the Containerfile and include the
>>> shell script in there and run the shell script in a single RUN. You
>>> could then do the cleanup step in a separate layer, not optimal since
>>> the layer will be unnecessarily big but I assume the end image is going
>>> to be smaller anyway?
>>>
>>> The benefit is that you test **exactly** what's in the documentation.
>>> And also you don't have to parse YAML. And it should be relatively easy
>>> to add support for new distros.
>>>
>>> What do you think?
>> 
>> I like this idea! Only downside I see is to have to re-run "apt get install" for
>> each run command, which is not really ideal IMO. I prefer caching what's already
>
> No? I have one RUN command which is "execute this shell script". 
> Whatever is executed as part of this shell script is part of one layer. 
> You could even have one RUN command which is apt-get update && 
> my-shell-script && apt-get cache remove whatever to make this a smaller 
> layer.

So let me know if I got this wrong, but running "podman run ..." multiple times
is going to make use of a cached layer automagically? I.e. merged.sh is not
re-run each time? Or you are implying that there is still a Containerfile in the
process that caches this data?

>> been downloaded so we can re-build the doc swiftly (I imagined this script to be
>> used for more than verifying that what we list in the dep list is enough to
>> build, and used by anyone to test their changes to the docs easily).
>> 
>
> You can have as many scripts as you want? And have them do as many or 
> little things as you want, I'm not sure to follow what you have in mind 
> here?

That wasn't clear maybe. I simply meant to say that I want to publish this
script not only for verifying that listed dependencies are correct, but also
because it's convenient for anyone to build the documentation. It's unrelated to
your solution.

Thanks,
Antonin
Quentin Schulz Dec. 10, 2024, 1:48 p.m. UTC | #5
Hi Antonin,

On 12/10/24 1:46 PM, Antonin Godard wrote:
> Hi Quentin,
> 
> On Tue Dec 10, 2024 at 12:08 PM CET, Quentin Schulz wrote:
>> Hi Antonin,
>>
>> On 12/10/24 11:33 AM, Antonin Godard via lists.yoctoproject.org wrote:
>>> Hi Quentin,
>>>
>>> On Fri Dec 6, 2024 at 3:23 PM CET, Quentin Schulz wrote:
>>>> Hi Antonin,
>>>>
>>>> On 12/5/24 12:06 PM, Antonin Godard wrote:
>>>>> Add two scripts for building a container image and building the
>>>>> documentation in that image: build-docs-container and
>>>>> container-build-docs.
>>>>>
>>>>
>>>> Yet there's only one left now :) (also in your commit title).
>>>
>>> Oops, will update in v3 :)
>>>
>>>>> For now, the documentation/tools/dockerfiles directory contains a
>>>>> Dockerfile for building with Ubuntu or Debian, but we could extend this
>>>>> to the supported distros.
>>>>>
>>>>> It should be possible to build the full documentation with two commands:
>>>>>
>>>>>      ./documentation/tools/build-docs-container ubuntu:24.04
>>>>>
>>>>> CONTAINERCMD can be replaced by "podman" to build with podman.
>>>>>
>>>>
>>>> I assume installing podman-docker package could help with that too. Not
>>>> sure it's recommended, but I do have it installed :)
>>>
>>> Didn't know about that one. But it shouldn't be required to use podman or docker
>>> independently I think?
>>>
>>
>> It's just that when you run `which docker` you would get podman. But we
>> shouldn't require the user to install podman-docker for the script to
>> work. Just wanted to mention it.
>>
>> [...]
>>
>>>>> +SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
>>>>> +CONTAINERCMD=${CONTAINERCMD:-docker}
>>>>> +DOCS_DIR="$SCRIPT_DIR/../.."
>>>>> +POKY_YAML_IN="$SCRIPT_DIR/../poky.yaml.in"
>>>>> +
>>>>> +# This lists the different images we can build and the keys we pass to yq to
>>>>> +# find packages in poky.yaml.in.
>>>>> +# The keys should be in the form of "<image>:<version>", as this will be passed
>>>>> +# to FROM in the selected Dockerfile below.
>>>>> +# These are common yq keys used for multiple distros.
>>>>> +_UBUNTU_DEBIAN_YQ_KEYS=".UBUNTU_DEBIAN_HOST_PACKAGES_DOC .UBUNTU_DEBIAN_HOST_PACKAGES_DOC_PDF"
>>>>> +declare -A YQ_KEYS=(
>>>>> +  [ubuntu:22.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>>> +  [ubuntu:24.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>>> +  [debian:12]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>>> +)
>>>>> +
>>>>> +# This lists the dockerfile to use for each of the distro listed in YQ_KEYS
>>>>> +# above. There should be a 1 to 1 match between the keys listed in YQ_KEYS above
>>>>> +# and the keys listed in DOCKERFILES below.
>>>>> +declare -A DOCKERFILES=(
>>>>> +  [ubuntu:22.04]="Dockerfile.ubuntu-debian"
>>>>> +  [ubuntu:24.04]="Dockerfile.ubuntu-debian"
>>>>> +  [debian:12]="Dockerfile.ubuntu-debian"
>>>>> +)
>>>>> +
>>>>> +main ()
>>>>> +{
>>>>> +  if [ "$#" -lt 1 ]; then
>>>>> +    echo "No image provided. Provide one of: ${!YQ_KEYS[*]}"
>>>>> +    exit 1
>>>>> +  elif [ "$1" = "list" ]; then
>>>>> +    echo -e "Available container images:\n\n${!YQ_KEYS[*]}"
>>>>> +    exit 0
>>>>> +  fi
>>>>> +
>>>>> +  local image="$1"
>>>>> +  shift
>>>>> +  local make_targets="${*:-publish}"
>>>>> +
>>>>> +  for cmd in $CONTAINERCMD yq; do
>>>>> +    if ! which "$cmd" >/dev/null 2>&1; then
>>>>> +      echo "The $cmd command was not found. Make sure you have $cmd installed."
>>>>> +      exit 1
>>>>> +    fi
>>>>> +  done
>>>>> +
>>>>> +  # Get the appropriate dockerfile from DOCKERFILES
>>>>> +  dockerfile="${DOCKERFILES[$image]}"
>>>>> +
>>>>> +  local temporary_dep_file="$SCRIPT_DIR/dockerfiles/deps"
>>>>> +  echo -n > "$temporary_dep_file" # empty the file
>>>>> +  for dep_key in ${YQ_KEYS[$image]}; do
>>>>> +    yq --raw-output --join-output "$dep_key" "$POKY_YAML_IN" >> "$temporary_dep_file"
>>>>> +    # add a white space after last element of yq command
>>>>> +    echo -n " " >> "$temporary_dep_file"
>>>>> +  done
>>>>> +
>>>>
>>>> Just use a temporary file, this would allow to run builds from two
>>>> different distros at the same time. mktemp should help you with that.
>>>
>>> The problem is that the source in COPY must be part of the build context, so I
>>> don't having a temp file in /tmp is possible. See
>>> https://docs.docker.com/reference/dockerfile/#adding-files-from-the-build-context
>>>
>>> I could use "mktemp --tmpdir=$SCRIPT_DIR/dockerfiles" but I don't really see the
>>> added value.
>>>
>>
>> "this would allow to run builds from two different distros at the same time"
>>
>> Also, it's guaranteed that an old file wouldn't be used in case the
>> script is somehow broken.
> 
> I see what you mean. However, since mktemp create a random name it creates a
> new layer each time, invalidating the following commands in the container file.
> 
> Instead I propose to move this above:
> 
>>>>> +  # docker build doesn't accept 2 colons, so "sanitize" the name
>>>>> +  local sanitized_dockername
>>>>> +  sanitized_dockername=$(echo "$image" | tr ':.' '-')
> 
> And do this:
> 
>    temporary_dep_file="$SCRIPT_DIR/dockerfiles/$sanitized_dockername.deps"
> 
> So we avoid the distro clash and we still use cached layers.
> 
>>>>> +
>>>>> +  $CONTAINERCMD build \
>>>>> +    --tag yocto-docs-$sanitized_dockername:latest \
>>>>> +    --build-arg ARG_FROM="$image" \
>>>>> +    --file "$SCRIPT_DIR/dockerfiles/$dockerfile" \
>>>>> +    "$SCRIPT_DIR/dockerfiles"
>>>>> +
>>>>> +  # We can remove the deps file, we no longer need it
>>>>> +  rm -f "$temporary_dep_file"
>>>>> +
>>>>> +  local -a args_run=(
>>>>> +    --rm
>>>>> +    --interactive
>>>>> +    --tty
>>>>> +    --volume="$DOCS_DIR:/docs:rw"
>>>>> +    --workdir=/docs
>>>>> +    --security-opt label=disable
>>>>> +  )
>>>>> +
>>>>> +  if [ "$CONTAINERCMD" = "docker" ]; then
>>>>> +    args_run+=(
>>>>> +      --user="$(id -u)":"$(id -g)"
>>>>> +    )
>>>>> +  elif [ "$CONTAINERCMD" = "podman" ]; then
>>>>> +    # we need net access to fetch bitbake terms
>>>>> +    args_run+=(
>>>>> +      --cap-add=NET_RAW
>>>>> +      --userns=keep-id
>>>>> +    )
>>>>> +  fi
>>>>> +
>>>>> +  $CONTAINERCMD run \
>>>>> +    "${args_run[@]}" \
>>>>> +    yocto-docs-$sanitized_dockername \
>>>>> +    make -C documentation $make_targets
>>>>> +}
>>>>> +
>>>>> +set -eu -o pipefail
>>>>> +
>>>>> +main "$@"
>>>>> diff --git a/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>>>> new file mode 100644
>>>>> index 0000000000000000000000000000000000000000..3c543dc4b0e96dc9c00b201558e2ed00847339fa
>>>>> --- /dev/null
>>>>> +++ b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>>>> @@ -0,0 +1,18 @@
>>>>> +ARG ARG_FROM=debian:12 # default value to avoid warning
>>>>> +FROM $ARG_FROM
>>>>> +
>>>>> +ENV DEBIAN_FRONTEND=noninteractive
>>>>> +
>>>>> +# relative to the location of the dockerfile
>>>>> +COPY deps /deps
>>>>> +
>>>>> +RUN apt-get update \
>>>>> + && apt-get --yes --no-install-recommends install $(cat /deps) \
>>>>> + && apt-get --yes autoremove \
>>>>> + && apt-get clean \
>>>>> + && rm /deps
>>>>> +
>>>>> +RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && locale-gen
>>>>> +
>>>>> +RUN git config --global --add safe.directory /docs
>>>>> +
>>>>>
>>>>
>>>> So... I've had another idea.
>>>>
>>>> Parsing yaml in shell, not fun.
>>>>
>>>> But I don't think we **need** the instructions to be in the YAML file.
>>>>
>>>> So I think we could drastically simplify all this by storing the setup
>>>> instructions in shell scripts which are included in sphinx AND use those
>>>> shell scripts in the Containerfile. We actually do this for our product
>>>> user manuals :)
>>>>
>>>> Essentially, we would have:
>>>>
>>>> $ cat documentation/host-packages-ubuntu.sh
>>>> sudo apt install gawk wget git diffstat unzip texinfo gcc \ [...]
>>>>
>>>> In Sphinx:
>>>>
>>>> .. literalinclude:: 50.literalinclude.apt.sh
>>>>       :language: bash
>>>>
>>>> In the appropriate place.
>>>>
>>>> How we generate the Containerfile internally:
>>>> """
>>>> cat <<EOF > $WORKDIR/merged.sh
>>>> #!/bin/bash
>>>> set -eux
>>>> export DEBIAN_FRONTEND=noninteractive
>>>> apt-get update
>>>> apt-get -y install sudo
>>>> EOF
>>>>
>>>> # Merge bash snippets we want to run (in the right order)
>>>> cat ../source/50.literalinclude.{apt,atf,uboot,linux,debos-prepare,\
>>>> debos-build-bookworm}.sh >> $WORKDIR/merged.sh
>>>>
>>>> chmod +x $WORKDIR/merged.sh
>>>>
>>>> podman run --rm --tmpfs=/tmp -v=$WORKDIR:$WORKDIR --security-opt
>>>> label=disable \  --annotation -w $WORKDIR debian:bookworm ./merged.sh
>>>> """
>>>>
>>>> I assume we could simply use COPY in the Containerfile and include the
>>>> shell script in there and run the shell script in a single RUN. You
>>>> could then do the cleanup step in a separate layer, not optimal since
>>>> the layer will be unnecessarily big but I assume the end image is going
>>>> to be smaller anyway?
>>>>
>>>> The benefit is that you test **exactly** what's in the documentation.
>>>> And also you don't have to parse YAML. And it should be relatively easy
>>>> to add support for new distros.
>>>>
>>>> What do you think?
>>>
>>> I like this idea! Only downside I see is to have to re-run "apt get install" for
>>> each run command, which is not really ideal IMO. I prefer caching what's already
>>
>> No? I have one RUN command which is "execute this shell script".
>> Whatever is executed as part of this shell script is part of one layer.
>> You could even have one RUN command which is apt-get update &&
>> my-shell-script && apt-get cache remove whatever to make this a smaller
>> layer.
> 
> So let me know if I got this wrong, but running "podman run ..." multiple times
> is going to make use of a cached layer automagically? I.e. merged.sh is not
> re-run each time? Or you are implying that there is still a Containerfile in the
> process that caches this data?
> 

OK, so I actually didn't write what I was thinking about :)

To answer your question, podman run cannot cache what's running in there 
with the above command, so it's run every time. This is fine for our 
manual because we do want to run those commands every time and not rely 
on caching (e.g. because we build the HEAD of some git repo and that 
**is** desired).

But we could simply generate this merged.sh script before podman build 
and pass it inside the Containerfile so it makes it as a layer.

Obviously, any change to the shell script would trigger the COPY layer 
to change and everything after be rebuilt. So the cache will work until 
the shell script content (mtime actually I believe?) changes.

>>> been downloaded so we can re-build the doc swiftly (I imagined this script to be
>>> used for more than verifying that what we list in the dep list is enough to
>>> build, and used by anyone to test their changes to the docs easily).
>>>
>>
>> You can have as many scripts as you want? And have them do as many or
>> little things as you want, I'm not sure to follow what you have in mind
>> here?
> 
> That wasn't clear maybe. I simply meant to say that I want to publish this
> script not only for verifying that listed dependencies are correct, but also
> because it's convenient for anyone to build the documentation. It's unrelated to
> your solution.
> 

I'm working on something right now, will send as RFC so you can comment 
on it. Should happen today or tomorrow.

BTW, I'm unable to build the docs on openSUSE Leap 15.4/15.5/15.6. It 
seems it's always trying to build from python 3.6 and something breaks 
in the sphinx part very early:

0efb6adca777:/docs # make -C documentation/ html
make: Entering directory '/docs/documentation'
./set_versions.py
Traceback (most recent call last):
   File "./set_versions.py", line 102, in <module>
     subprocess.run(["git", "show", "yocto-%s" % 
release_series[activereleases[0]]], capture_output=True, check=True)
   File "/usr/lib64/python3.6/subprocess.py", line 423, in run
     with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'capture_output'
make: *** [Makefile:78: html] Error 1
make: Leaving directory '/docs/documentation'

Cheers,
Quentin
Antonin Godard Dec. 10, 2024, 2:48 p.m. UTC | #6
Hi Quentin,

On Tue Dec 10, 2024 at 2:48 PM CET, Quentin Schulz wrote:
> Hi Antonin,
>
> On 12/10/24 1:46 PM, Antonin Godard wrote:
>> Hi Quentin,
>> 
>> On Tue Dec 10, 2024 at 12:08 PM CET, Quentin Schulz wrote:
>>> Hi Antonin,
>>>
>>> On 12/10/24 11:33 AM, Antonin Godard via lists.yoctoproject.org wrote:
>>>> Hi Quentin,
>>>>
>>>> On Fri Dec 6, 2024 at 3:23 PM CET, Quentin Schulz wrote:
>>>>> Hi Antonin,
>>>>>
>>>>> On 12/5/24 12:06 PM, Antonin Godard wrote:
>>>>>> Add two scripts for building a container image and building the
>>>>>> documentation in that image: build-docs-container and
>>>>>> container-build-docs.
>>>>>>
>>>>>
>>>>> Yet there's only one left now :) (also in your commit title).
>>>>
>>>> Oops, will update in v3 :)
>>>>
>>>>>> For now, the documentation/tools/dockerfiles directory contains a
>>>>>> Dockerfile for building with Ubuntu or Debian, but we could extend this
>>>>>> to the supported distros.
>>>>>>
>>>>>> It should be possible to build the full documentation with two commands:
>>>>>>
>>>>>>      ./documentation/tools/build-docs-container ubuntu:24.04
>>>>>>
>>>>>> CONTAINERCMD can be replaced by "podman" to build with podman.
>>>>>>
>>>>>
>>>>> I assume installing podman-docker package could help with that too. Not
>>>>> sure it's recommended, but I do have it installed :)
>>>>
>>>> Didn't know about that one. But it shouldn't be required to use podman or docker
>>>> independently I think?
>>>>
>>>
>>> It's just that when you run `which docker` you would get podman. But we
>>> shouldn't require the user to install podman-docker for the script to
>>> work. Just wanted to mention it.
>>>
>>> [...]
>>>
>>>>>> +SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
>>>>>> +CONTAINERCMD=${CONTAINERCMD:-docker}
>>>>>> +DOCS_DIR="$SCRIPT_DIR/../.."
>>>>>> +POKY_YAML_IN="$SCRIPT_DIR/../poky.yaml.in"
>>>>>> +
>>>>>> +# This lists the different images we can build and the keys we pass to yq to
>>>>>> +# find packages in poky.yaml.in.
>>>>>> +# The keys should be in the form of "<image>:<version>", as this will be passed
>>>>>> +# to FROM in the selected Dockerfile below.
>>>>>> +# These are common yq keys used for multiple distros.
>>>>>> +_UBUNTU_DEBIAN_YQ_KEYS=".UBUNTU_DEBIAN_HOST_PACKAGES_DOC .UBUNTU_DEBIAN_HOST_PACKAGES_DOC_PDF"
>>>>>> +declare -A YQ_KEYS=(
>>>>>> +  [ubuntu:22.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>>>> +  [ubuntu:24.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>>>> +  [debian:12]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>>>> +)
>>>>>> +
>>>>>> +# This lists the dockerfile to use for each of the distro listed in YQ_KEYS
>>>>>> +# above. There should be a 1 to 1 match between the keys listed in YQ_KEYS above
>>>>>> +# and the keys listed in DOCKERFILES below.
>>>>>> +declare -A DOCKERFILES=(
>>>>>> +  [ubuntu:22.04]="Dockerfile.ubuntu-debian"
>>>>>> +  [ubuntu:24.04]="Dockerfile.ubuntu-debian"
>>>>>> +  [debian:12]="Dockerfile.ubuntu-debian"
>>>>>> +)
>>>>>> +
>>>>>> +main ()
>>>>>> +{
>>>>>> +  if [ "$#" -lt 1 ]; then
>>>>>> +    echo "No image provided. Provide one of: ${!YQ_KEYS[*]}"
>>>>>> +    exit 1
>>>>>> +  elif [ "$1" = "list" ]; then
>>>>>> +    echo -e "Available container images:\n\n${!YQ_KEYS[*]}"
>>>>>> +    exit 0
>>>>>> +  fi
>>>>>> +
>>>>>> +  local image="$1"
>>>>>> +  shift
>>>>>> +  local make_targets="${*:-publish}"
>>>>>> +
>>>>>> +  for cmd in $CONTAINERCMD yq; do
>>>>>> +    if ! which "$cmd" >/dev/null 2>&1; then
>>>>>> +      echo "The $cmd command was not found. Make sure you have $cmd installed."
>>>>>> +      exit 1
>>>>>> +    fi
>>>>>> +  done
>>>>>> +
>>>>>> +  # Get the appropriate dockerfile from DOCKERFILES
>>>>>> +  dockerfile="${DOCKERFILES[$image]}"
>>>>>> +
>>>>>> +  local temporary_dep_file="$SCRIPT_DIR/dockerfiles/deps"
>>>>>> +  echo -n > "$temporary_dep_file" # empty the file
>>>>>> +  for dep_key in ${YQ_KEYS[$image]}; do
>>>>>> +    yq --raw-output --join-output "$dep_key" "$POKY_YAML_IN" >> "$temporary_dep_file"
>>>>>> +    # add a white space after last element of yq command
>>>>>> +    echo -n " " >> "$temporary_dep_file"
>>>>>> +  done
>>>>>> +
>>>>>
>>>>> Just use a temporary file, this would allow to run builds from two
>>>>> different distros at the same time. mktemp should help you with that.
>>>>
>>>> The problem is that the source in COPY must be part of the build context, so I
>>>> don't having a temp file in /tmp is possible. See
>>>> https://docs.docker.com/reference/dockerfile/#adding-files-from-the-build-context
>>>>
>>>> I could use "mktemp --tmpdir=$SCRIPT_DIR/dockerfiles" but I don't really see the
>>>> added value.
>>>>
>>>
>>> "this would allow to run builds from two different distros at the same time"
>>>
>>> Also, it's guaranteed that an old file wouldn't be used in case the
>>> script is somehow broken.
>> 
>> I see what you mean. However, since mktemp create a random name it creates a
>> new layer each time, invalidating the following commands in the container file.
>> 
>> Instead I propose to move this above:
>> 
>>>>>> +  # docker build doesn't accept 2 colons, so "sanitize" the name
>>>>>> +  local sanitized_dockername
>>>>>> +  sanitized_dockername=$(echo "$image" | tr ':.' '-')
>> 
>> And do this:
>> 
>>    temporary_dep_file="$SCRIPT_DIR/dockerfiles/$sanitized_dockername.deps"
>> 
>> So we avoid the distro clash and we still use cached layers.
>> 
>>>>>> +
>>>>>> +  $CONTAINERCMD build \
>>>>>> +    --tag yocto-docs-$sanitized_dockername:latest \
>>>>>> +    --build-arg ARG_FROM="$image" \
>>>>>> +    --file "$SCRIPT_DIR/dockerfiles/$dockerfile" \
>>>>>> +    "$SCRIPT_DIR/dockerfiles"
>>>>>> +
>>>>>> +  # We can remove the deps file, we no longer need it
>>>>>> +  rm -f "$temporary_dep_file"
>>>>>> +
>>>>>> +  local -a args_run=(
>>>>>> +    --rm
>>>>>> +    --interactive
>>>>>> +    --tty
>>>>>> +    --volume="$DOCS_DIR:/docs:rw"
>>>>>> +    --workdir=/docs
>>>>>> +    --security-opt label=disable
>>>>>> +  )
>>>>>> +
>>>>>> +  if [ "$CONTAINERCMD" = "docker" ]; then
>>>>>> +    args_run+=(
>>>>>> +      --user="$(id -u)":"$(id -g)"
>>>>>> +    )
>>>>>> +  elif [ "$CONTAINERCMD" = "podman" ]; then
>>>>>> +    # we need net access to fetch bitbake terms
>>>>>> +    args_run+=(
>>>>>> +      --cap-add=NET_RAW
>>>>>> +      --userns=keep-id
>>>>>> +    )
>>>>>> +  fi
>>>>>> +
>>>>>> +  $CONTAINERCMD run \
>>>>>> +    "${args_run[@]}" \
>>>>>> +    yocto-docs-$sanitized_dockername \
>>>>>> +    make -C documentation $make_targets
>>>>>> +}
>>>>>> +
>>>>>> +set -eu -o pipefail
>>>>>> +
>>>>>> +main "$@"
>>>>>> diff --git a/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>>>>> new file mode 100644
>>>>>> index 0000000000000000000000000000000000000000..3c543dc4b0e96dc9c00b201558e2ed00847339fa
>>>>>> --- /dev/null
>>>>>> +++ b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>>>>> @@ -0,0 +1,18 @@
>>>>>> +ARG ARG_FROM=debian:12 # default value to avoid warning
>>>>>> +FROM $ARG_FROM
>>>>>> +
>>>>>> +ENV DEBIAN_FRONTEND=noninteractive
>>>>>> +
>>>>>> +# relative to the location of the dockerfile
>>>>>> +COPY deps /deps
>>>>>> +
>>>>>> +RUN apt-get update \
>>>>>> + && apt-get --yes --no-install-recommends install $(cat /deps) \
>>>>>> + && apt-get --yes autoremove \
>>>>>> + && apt-get clean \
>>>>>> + && rm /deps
>>>>>> +
>>>>>> +RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && locale-gen
>>>>>> +
>>>>>> +RUN git config --global --add safe.directory /docs
>>>>>> +
>>>>>>
>>>>>
>>>>> So... I've had another idea.
>>>>>
>>>>> Parsing yaml in shell, not fun.
>>>>>
>>>>> But I don't think we **need** the instructions to be in the YAML file.
>>>>>
>>>>> So I think we could drastically simplify all this by storing the setup
>>>>> instructions in shell scripts which are included in sphinx AND use those
>>>>> shell scripts in the Containerfile. We actually do this for our product
>>>>> user manuals :)
>>>>>
>>>>> Essentially, we would have:
>>>>>
>>>>> $ cat documentation/host-packages-ubuntu.sh
>>>>> sudo apt install gawk wget git diffstat unzip texinfo gcc \ [...]
>>>>>
>>>>> In Sphinx:
>>>>>
>>>>> .. literalinclude:: 50.literalinclude.apt.sh
>>>>>       :language: bash
>>>>>
>>>>> In the appropriate place.
>>>>>
>>>>> How we generate the Containerfile internally:
>>>>> """
>>>>> cat <<EOF > $WORKDIR/merged.sh
>>>>> #!/bin/bash
>>>>> set -eux
>>>>> export DEBIAN_FRONTEND=noninteractive
>>>>> apt-get update
>>>>> apt-get -y install sudo
>>>>> EOF
>>>>>
>>>>> # Merge bash snippets we want to run (in the right order)
>>>>> cat ../source/50.literalinclude.{apt,atf,uboot,linux,debos-prepare,\
>>>>> debos-build-bookworm}.sh >> $WORKDIR/merged.sh
>>>>>
>>>>> chmod +x $WORKDIR/merged.sh
>>>>>
>>>>> podman run --rm --tmpfs=/tmp -v=$WORKDIR:$WORKDIR --security-opt
>>>>> label=disable \  --annotation -w $WORKDIR debian:bookworm ./merged.sh
>>>>> """
>>>>>
>>>>> I assume we could simply use COPY in the Containerfile and include the
>>>>> shell script in there and run the shell script in a single RUN. You
>>>>> could then do the cleanup step in a separate layer, not optimal since
>>>>> the layer will be unnecessarily big but I assume the end image is going
>>>>> to be smaller anyway?
>>>>>
>>>>> The benefit is that you test **exactly** what's in the documentation.
>>>>> And also you don't have to parse YAML. And it should be relatively easy
>>>>> to add support for new distros.
>>>>>
>>>>> What do you think?
>>>>
>>>> I like this idea! Only downside I see is to have to re-run "apt get install" for
>>>> each run command, which is not really ideal IMO. I prefer caching what's already
>>>
>>> No? I have one RUN command which is "execute this shell script".
>>> Whatever is executed as part of this shell script is part of one layer.
>>> You could even have one RUN command which is apt-get update &&
>>> my-shell-script && apt-get cache remove whatever to make this a smaller
>>> layer.
>> 
>> So let me know if I got this wrong, but running "podman run ..." multiple times
>> is going to make use of a cached layer automagically? I.e. merged.sh is not
>> re-run each time? Or you are implying that there is still a Containerfile in the
>> process that caches this data?
>> 
>
> OK, so I actually didn't write what I was thinking about :)
>
> To answer your question, podman run cannot cache what's running in there 
> with the above command, so it's run every time. This is fine for our 
> manual because we do want to run those commands every time and not rely 
> on caching (e.g. because we build the HEAD of some git repo and that 
> **is** desired).
>
> But we could simply generate this merged.sh script before podman build 
> and pass it inside the Containerfile so it makes it as a layer.
>
> Obviously, any change to the shell script would trigger the COPY layer 
> to change and everything after be rebuilt. So the cache will work until 
> the shell script content (mtime actually I believe?) changes.

Okay, we're on the same page, thanks for clarifying!

>>>> been downloaded so we can re-build the doc swiftly (I imagined this script to be
>>>> used for more than verifying that what we list in the dep list is enough to
>>>> build, and used by anyone to test their changes to the docs easily).
>>>>
>>>
>>> You can have as many scripts as you want? And have them do as many or
>>> little things as you want, I'm not sure to follow what you have in mind
>>> here?
>> 
>> That wasn't clear maybe. I simply meant to say that I want to publish this
>> script not only for verifying that listed dependencies are correct, but also
>> because it's convenient for anyone to build the documentation. It's unrelated to
>> your solution.
>> 
>
> I'm working on something right now, will send as RFC so you can comment 
> on it. Should happen today or tomorrow.

Nice! Are you basing things on top of my patch? Or is it separate? I got a v3
ready in any case.

> BTW, I'm unable to build the docs on openSUSE Leap 15.4/15.5/15.6. It 
> seems it's always trying to build from python 3.6 and something breaks 
> in the sphinx part very early:
>
> 0efb6adca777:/docs # make -C documentation/ html
> make: Entering directory '/docs/documentation'
> ./set_versions.py
> Traceback (most recent call last):
>    File "./set_versions.py", line 102, in <module>
>      subprocess.run(["git", "show", "yocto-%s" % 
> release_series[activereleases[0]]], capture_output=True, check=True)
>    File "/usr/lib64/python3.6/subprocess.py", line 423, in run
>      with Popen(*popenargs, **kwargs) as process:
> TypeError: __init__() got an unexpected keyword argument 'capture_output'
> make: *** [Makefile:78: html] Error 1
> make: Leaving directory '/docs/documentation'

As expected, documentation builds aren't really working on all distros :/ Thanks
for reporting this. Hopefully the scripts will help with that.


Antonin
diff mbox series

Patch

diff --git a/documentation/README b/documentation/README
index 8a47fd4a3fd07d41d61a7d681d82bd13ac74527d..aebece13758005a8b913b4570fee6118256f6f9b 100644
--- a/documentation/README
+++ b/documentation/README
@@ -128,6 +128,33 @@  dependencies in a virtual environment:
  $ pipenv install
  $ pipenv run make html
 
+Building the documentation in a container
+=========================================
+
+The documentation can be built in a container with the build-docs-container
+scripts in the tools/ directory. The default container command used by the
+script is docker, but the script also supports podman by setting the
+CONTAINERCMD variable in your environment:
+
+  $ export CONTAINERCMD=podman
+
+A basic usage of the script would be:
+
+  $ ./tools/build-docs-container debian:12
+
+This will the entire documentation in an Debian 12 container.
+
+The documentation can be built in other distributions. This list can be obtained
+by running:
+
+  $ ./tools/build-docs-container list
+
+The make target can optionally be provided to build a single documentation type
+(these targets are listed in the documentation Makefile). For example, to build
+the documentation in HTML and EPub formats, the following command can be used:
+
+  $ ./tools/build-docs-container ubuntu:24.04 epub html
+
 Style checking the Yocto Project documentation
 ==============================================
 
diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container
new file mode 100755
index 0000000000000000000000000000000000000000..b9578d6da49b707c4de215866b6396cfeff3d08b
--- /dev/null
+++ b/documentation/tools/build-docs-container
@@ -0,0 +1,119 @@ 
+#!/usr/bin/env bash
+#
+# Build a container ready to build the documentation be reading the dependencies
+# listed in poky.yaml.in, and start a documentation build in this container.
+#
+# Usage:
+#
+#   ./documentation/tools/build-docs-container <image> [<make target>]
+#
+# Where <image> is one of the keys in YQ_KEYS. E.g.:
+#
+#   ./documentation/tools/build-docs-container ubuntu:24.04 html
+#
+# Will build the docs in an Ubuntu 24.04 container in html.
+#
+# The container commands can be used by exporting CONTAINERCMD in the
+# environment. The default is docker, but podman can also be used.
+
+SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
+CONTAINERCMD=${CONTAINERCMD:-docker}
+DOCS_DIR="$SCRIPT_DIR/../.."
+POKY_YAML_IN="$SCRIPT_DIR/../poky.yaml.in"
+
+# This lists the different images we can build and the keys we pass to yq to
+# find packages in poky.yaml.in.
+# The keys should be in the form of "<image>:<version>", as this will be passed
+# to FROM in the selected Dockerfile below.
+# These are common yq keys used for multiple distros.
+_UBUNTU_DEBIAN_YQ_KEYS=".UBUNTU_DEBIAN_HOST_PACKAGES_DOC .UBUNTU_DEBIAN_HOST_PACKAGES_DOC_PDF"
+declare -A YQ_KEYS=(
+  [ubuntu:22.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
+  [ubuntu:24.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
+  [debian:12]="$_UBUNTU_DEBIAN_YQ_KEYS"
+)
+
+# This lists the dockerfile to use for each of the distro listed in YQ_KEYS
+# above. There should be a 1 to 1 match between the keys listed in YQ_KEYS above
+# and the keys listed in DOCKERFILES below.
+declare -A DOCKERFILES=(
+  [ubuntu:22.04]="Dockerfile.ubuntu-debian"
+  [ubuntu:24.04]="Dockerfile.ubuntu-debian"
+  [debian:12]="Dockerfile.ubuntu-debian"
+)
+
+main ()
+{
+  if [ "$#" -lt 1 ]; then
+    echo "No image provided. Provide one of: ${!YQ_KEYS[*]}"
+    exit 1
+  elif [ "$1" = "list" ]; then
+    echo -e "Available container images:\n\n${!YQ_KEYS[*]}"
+    exit 0
+  fi
+
+  local image="$1"
+  shift
+  local make_targets="${*:-publish}"
+
+  for cmd in $CONTAINERCMD yq; do
+    if ! which "$cmd" >/dev/null 2>&1; then
+      echo "The $cmd command was not found. Make sure you have $cmd installed."
+      exit 1
+    fi
+  done
+
+  # Get the appropriate dockerfile from DOCKERFILES
+  dockerfile="${DOCKERFILES[$image]}"
+
+  local temporary_dep_file="$SCRIPT_DIR/dockerfiles/deps"
+  echo -n > "$temporary_dep_file" # empty the file
+  for dep_key in ${YQ_KEYS[$image]}; do
+    yq --raw-output --join-output "$dep_key" "$POKY_YAML_IN" >> "$temporary_dep_file"
+    # add a white space after last element of yq command
+    echo -n " " >> "$temporary_dep_file"
+  done
+
+  # docker build doesn't accept 2 colons, so "sanitize" the name
+  local sanitized_dockername
+  sanitized_dockername=$(echo "$image" | tr ':.' '-')
+
+  $CONTAINERCMD build \
+    --tag yocto-docs-$sanitized_dockername:latest \
+    --build-arg ARG_FROM="$image" \
+    --file "$SCRIPT_DIR/dockerfiles/$dockerfile" \
+    "$SCRIPT_DIR/dockerfiles"
+
+  # We can remove the deps file, we no longer need it
+  rm -f "$temporary_dep_file"
+
+  local -a args_run=(
+    --rm
+    --interactive
+    --tty
+    --volume="$DOCS_DIR:/docs:rw"
+    --workdir=/docs
+    --security-opt label=disable
+  )
+
+  if [ "$CONTAINERCMD" = "docker" ]; then
+    args_run+=(
+      --user="$(id -u)":"$(id -g)"
+    )
+  elif [ "$CONTAINERCMD" = "podman" ]; then
+    # we need net access to fetch bitbake terms
+    args_run+=(
+      --cap-add=NET_RAW
+      --userns=keep-id
+    )
+  fi
+
+  $CONTAINERCMD run \
+    "${args_run[@]}" \
+    yocto-docs-$sanitized_dockername \
+    make -C documentation $make_targets
+}
+
+set -eu -o pipefail
+
+main "$@"
diff --git a/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
new file mode 100644
index 0000000000000000000000000000000000000000..3c543dc4b0e96dc9c00b201558e2ed00847339fa
--- /dev/null
+++ b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
@@ -0,0 +1,18 @@ 
+ARG ARG_FROM=debian:12 # default value to avoid warning
+FROM $ARG_FROM
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+# relative to the location of the dockerfile
+COPY deps /deps
+
+RUN apt-get update \
+ && apt-get --yes --no-install-recommends install $(cat /deps) \
+ && apt-get --yes autoremove \
+ && apt-get clean \
+ && rm /deps
+
+RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && locale-gen
+
+RUN git config --global --add safe.directory /docs
+