| Message ID | 20251009-iid-file-v2-3-715d527457f0@cherry.de |
|---|---|
| State | Under Review |
| Headers | show |
| Series | tools/build-docs-container: improve concurrent safety | expand |
On Thu Oct 9, 2025 at 12:24 PM CEST, Quentin Schulz wrote: > From: Quentin Schulz <quentin.schulz@cherry.de> > > This allows to compare build output from different distros if necessary > and make it easier to make sure we start building from a clean slate. > > Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de> > --- > documentation/tools/build-docs-container | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container > index 831d357fb..47320e182 100755 > --- a/documentation/tools/build-docs-container > +++ b/documentation/tools/build-docs-container > @@ -60,6 +60,7 @@ main () > fi > > local image="$1" > + local orig_image=$image > shift > > OCI=$(which "$CONTAINERCMD") > @@ -159,6 +160,7 @@ main () > --volume="$DOCS_DIR:/docs:rw" > --workdir=/docs > --security-opt label=disable > + --env BUILDDIR="_build-$orig_image-$image_sha" I think the idea is good for improving concurrency, however I think for development it would be nice to keep _build the default. What about having the following default behavior: - Build in "_build-$orig_image-$image_sha" - If "_build" does not exist _or_ is a symlink, create or replace a symlink to "_build-$orig_image-$image_sha" from the previous step, named "_build" _build -> _build-$orig_image-$image_sha" - If "_build" exists and is a directory, leave it untouched as it may be originating from another "regular" build. Then: - `make clean` removes "_build" and "_build-$orig_image-$image_sha" (when "_build-$orig_image-$image_sha" exists and _build is a symlink to it) - We could maybe also have a "cleanall" (or name alike) that removes all the _build* it finds, to clean everything up. With this, additional runs with different SHAs will just stack next to this, re-creating the symlink only when it is not a directory. This way, I know that for any previous run of build-docs-container, I can open the "_build" symlink to get the latest output. Of course I need to be careful when running concurrent builds, but that is another use-case, a bit different from day-to-day development IMO. What do you think? Antonin
Hi Antonin, On 10/14/25 11:27 AM, Antonin Godard wrote: > On Thu Oct 9, 2025 at 12:24 PM CEST, Quentin Schulz wrote: >> From: Quentin Schulz <quentin.schulz@cherry.de> >> >> This allows to compare build output from different distros if necessary >> and make it easier to make sure we start building from a clean slate. >> >> Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de> >> --- >> documentation/tools/build-docs-container | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container >> index 831d357fb..47320e182 100755 >> --- a/documentation/tools/build-docs-container >> +++ b/documentation/tools/build-docs-container >> @@ -60,6 +60,7 @@ main () >> fi >> >> local image="$1" >> + local orig_image=$image >> shift >> >> OCI=$(which "$CONTAINERCMD") >> @@ -159,6 +160,7 @@ main () >> --volume="$DOCS_DIR:/docs:rw" >> --workdir=/docs >> --security-opt label=disable >> + --env BUILDDIR="_build-$orig_image-$image_sha" > > I think the idea is good for improving concurrency, however I think for > development it would be nice to keep _build the default. What about having > the following default behavior: > > - Build in "_build-$orig_image-$image_sha" > - If "_build" does not exist _or_ is a symlink, create or replace a symlink to > "_build-$orig_image-$image_sha" from the previous step, named "_build" > > _build -> _build-$orig_image-$image_sha" > > - If "_build" exists and is a directory, leave it untouched as it may be > originating from another "regular" build. > > Then: > > - `make clean` removes "_build" and "_build-$orig_image-$image_sha" (when > "_build-$orig_image-$image_sha" exists and _build is a symlink to it) > > - We could maybe also have a "cleanall" (or name alike) that removes all the > _build* it finds, to clean everything up. > To make sure the make clean still removes the old _build directories, I think we could simply build in a subdirectory of _build, e.g. BUILDDIR=_build/$orig_image-$image_sha then make clean should work just fine. > With this, additional runs with different SHAs will just stack next to this, > re-creating the symlink only when it is not a directory. This way, I know that > for any previous run of build-docs-container, I can open the "_build" symlink > to get the latest output. Of course I need to be careful when running concurrent > builds, but that is another use-case, a bit different from day-to-day > development IMO. > Mmmmm... So I think we could make this concurrent-safe by using an flock on some file? e.g. surround any Make target that modifies the source directory with an flock, something like shown here: https://unix.stackexchange.com/a/382916 And actually... I think we already have some issue with concurrency without calling make multiple times from the same tree. I believe when you run make all, make will run make html, make epub and make latexpdf in parallel which will all call set_versions.py which modifies the source tree. Same would happen if passing multiple targets to make (e.g. make latexpdf html). I don't think the f.write() and w.write() in set_versions.py are concurrent-safe. > What do you think? > With the flock above, we could simply add the creation of a _build/latest symlink which points to the last make target that ran. Maybe we want a _build/latest/html for the last make html target, otherwise imagine you run make all once and then make html one more time for another distro, then _build/latest will now point to _build/distroB/ which didn't generate the html part). But that could also mean we should split the build directories for latexpdf, html and epub so this works... but that means not being able to reuse the sphinx "cache" (I believe there's one to do incremental builds? maybe it's also used for different output types?). I think we (I) may be overengineering this a tiny bit :) I think we probably should add an flock at least for the writing of the files by set_versions.py if that proves to be an issue (testing needs to be done :) ) Cheers, Quentin
diff --git a/documentation/tools/build-docs-container b/documentation/tools/build-docs-container index 831d357fb..47320e182 100755 --- a/documentation/tools/build-docs-container +++ b/documentation/tools/build-docs-container @@ -60,6 +60,7 @@ main () fi local image="$1" + local orig_image=$image shift OCI=$(which "$CONTAINERCMD") @@ -159,6 +160,7 @@ main () --volume="$DOCS_DIR:/docs:rw" --workdir=/docs --security-opt label=disable + --env BUILDDIR="_build-$orig_image-$image_sha" ) if [ "$OCI" = "docker" ]; then