diff mbox series

[v2,3/3] tools/build-docs-container: build in separate directory for each distro

Message ID 20251009-iid-file-v2-3-715d527457f0@cherry.de
State Under Review
Headers show
Series tools/build-docs-container: improve concurrent safety | expand

Commit Message

Quentin Schulz Oct. 9, 2025, 10:24 a.m. UTC
From: Quentin Schulz <quentin.schulz@cherry.de>

This allows to compare build output from different distros if necessary
and make it easier to make sure we start building from a clean slate.

Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de>
---
 documentation/tools/build-docs-container | 2 ++
 1 file changed, 2 insertions(+)

Comments

Antonin Godard Oct. 14, 2025, 9:27 a.m. UTC | #1
On Thu Oct 9, 2025 at 12:24 PM CEST, Quentin Schulz wrote:
> From: Quentin Schulz <quentin.schulz@cherry.de>
>
> This allows to compare build output from different distros if necessary
> and make it easier to make sure we start building from a clean slate.
>
> Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de>
> ---
>  documentation/tools/build-docs-container | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container
> index 831d357fb..47320e182 100755
> --- a/documentation/tools/build-docs-container
> +++ b/documentation/tools/build-docs-container
> @@ -60,6 +60,7 @@ main ()
>    fi
>  
>    local image="$1"
> +  local orig_image=$image
>    shift
>  
>    OCI=$(which "$CONTAINERCMD")
> @@ -159,6 +160,7 @@ main ()
>      --volume="$DOCS_DIR:/docs:rw"
>      --workdir=/docs
>      --security-opt label=disable
> +    --env BUILDDIR="_build-$orig_image-$image_sha"

I think the idea is good for improving concurrency, however I think for
development it would be nice to keep _build the default. What about having
the following default behavior:

- Build in "_build-$orig_image-$image_sha"
- If "_build" does not exist _or_ is a symlink, create or replace a symlink to
  "_build-$orig_image-$image_sha" from the previous step, named "_build"

    _build -> _build-$orig_image-$image_sha"

- If "_build" exists and is a directory, leave it untouched as it may be
  originating from another "regular" build.

Then:

- `make clean` removes "_build" and "_build-$orig_image-$image_sha" (when
  "_build-$orig_image-$image_sha" exists and _build is a symlink to it)

- We could maybe also have a "cleanall" (or name alike) that removes all the
  _build* it finds, to clean everything up.

With this, additional runs with different SHAs will just stack next to this,
re-creating the symlink only when it is not a directory. This way, I know that
for any previous run of build-docs-container, I can open the "_build" symlink
to get the latest output. Of course I need to be careful when running concurrent
builds, but that is another use-case, a bit different from day-to-day
development IMO.

What do you think?

Antonin
Quentin Schulz Oct. 15, 2025, 11:37 a.m. UTC | #2
Hi Antonin,

On 10/14/25 11:27 AM, Antonin Godard wrote:
> On Thu Oct 9, 2025 at 12:24 PM CEST, Quentin Schulz wrote:
>> From: Quentin Schulz <quentin.schulz@cherry.de>
>>
>> This allows to compare build output from different distros if necessary
>> and make it easier to make sure we start building from a clean slate.
>>
>> Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de>
>> ---
>>   documentation/tools/build-docs-container | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container
>> index 831d357fb..47320e182 100755
>> --- a/documentation/tools/build-docs-container
>> +++ b/documentation/tools/build-docs-container
>> @@ -60,6 +60,7 @@ main ()
>>     fi
>>   
>>     local image="$1"
>> +  local orig_image=$image
>>     shift
>>   
>>     OCI=$(which "$CONTAINERCMD")
>> @@ -159,6 +160,7 @@ main ()
>>       --volume="$DOCS_DIR:/docs:rw"
>>       --workdir=/docs
>>       --security-opt label=disable
>> +    --env BUILDDIR="_build-$orig_image-$image_sha"
> 
> I think the idea is good for improving concurrency, however I think for
> development it would be nice to keep _build the default. What about having
> the following default behavior:
> 
> - Build in "_build-$orig_image-$image_sha"
> - If "_build" does not exist _or_ is a symlink, create or replace a symlink to
>    "_build-$orig_image-$image_sha" from the previous step, named "_build"
> 
>      _build -> _build-$orig_image-$image_sha"
> 
> - If "_build" exists and is a directory, leave it untouched as it may be
>    originating from another "regular" build.
> 
> Then:
> 
> - `make clean` removes "_build" and "_build-$orig_image-$image_sha" (when
>    "_build-$orig_image-$image_sha" exists and _build is a symlink to it)
> 
> - We could maybe also have a "cleanall" (or name alike) that removes all the
>    _build* it finds, to clean everything up.
> 

To make sure the make clean still removes the old _build directories, I 
think we could simply build in a subdirectory of _build, e.g.

BUILDDIR=_build/$orig_image-$image_sha

then make clean should work just fine.

> With this, additional runs with different SHAs will just stack next to this,
> re-creating the symlink only when it is not a directory. This way, I know that
> for any previous run of build-docs-container, I can open the "_build" symlink
> to get the latest output. Of course I need to be careful when running concurrent
> builds, but that is another use-case, a bit different from day-to-day
> development IMO.
> 

Mmmmm...

So I think we could make this concurrent-safe by using an flock on some 
file?

e.g. surround any Make target that modifies the source directory with an 
flock, something like shown here: https://unix.stackexchange.com/a/382916

And actually... I think we already have some issue with concurrency 
without calling make multiple times from the same tree.

I believe when you run make all, make will run make html, make epub and 
make latexpdf in parallel which will all call set_versions.py which 
modifies the source tree. Same would happen if passing multiple targets 
to make (e.g. make latexpdf html). I don't think the f.write() and 
w.write() in set_versions.py are concurrent-safe.

> What do you think?
> 

With the flock above, we could simply add the creation of a 
_build/latest symlink which points to the last make target that ran. 
Maybe we want a _build/latest/html for the last make html target, 
otherwise imagine you run make all once and then make html one more time 
for another distro, then _build/latest will now point to _build/distroB/ 
which didn't generate the html part). But that could also mean we should 
split the build directories for latexpdf, html and epub so this works... 
but that means not being able to reuse the sphinx "cache" (I believe 
there's one to do incremental builds? maybe it's also used for different 
output types?).

I think we (I) may be overengineering this a tiny bit :)

I think we probably should add an flock at least for the writing of the 
files by set_versions.py if that proves to be an issue (testing needs to 
be done :) )

Cheers,
Quentin
diff mbox series

Patch

diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container
index 831d357fb..47320e182 100755
--- a/documentation/tools/build-docs-container
+++ b/documentation/tools/build-docs-container
@@ -60,6 +60,7 @@  main ()
   fi
 
   local image="$1"
+  local orig_image=$image
   shift
 
   OCI=$(which "$CONTAINERCMD")
@@ -159,6 +160,7 @@  main ()
     --volume="$DOCS_DIR:/docs:rw"
     --workdir=/docs
     --security-opt label=disable
+    --env BUILDDIR="_build-$orig_image-$image_sha"
   )
 
   if [ "$OCI" = "docker" ]; then