diff mbox series

initscripts: make populate-volatile.sh power-cut-save

Message ID 20240827085528.53810-1-viacheslav.volkov.1@gmail.com
State New
Headers show
Series initscripts: make populate-volatile.sh power-cut-save | expand

Commit Message

Viacheslav Volkov Aug. 27, 2024, 8:55 a.m. UTC
Atomically create files/folders with proper ownership and permissions
in a way that unclean reboots could not lead to any corruption or
inconsistency. See also:

http://www.linux-mtd.infradead.org/faq/ubifs.html#L_atomic_change

This guarantees that in case of a sudden power-cut we either don't have
a destination file/folder at all (in this case it will be handled by
next populate-volatile.sh execution) or have it with correct ownership
and permissions.

Note: we can't use ${TMPROOT} for storing temporary files/folders
because final mv command is guaranteed to be atomic only if both source
and destination are located within the same filesystem.

Other changes:

- Change owner:group for symlinks as requested.

- Wrap in double quotes all values which could have spaces.

- Propagate proper exit code from eval script. This might be useful in
  future e.g. to print a nice error message or propagate an error code
  further down to script caller.
  clean_temp() is executed all the time (on both success and failure)
  but doesn't affect exit code of the eval script (in case of
  chown/chmod failure the exit code will be propagated).

- mk_dir(): never silently create a parent folder if it doesn't exist
  because parent folders in this case might have undesired
  ownership/permissions. For malformed configs like:

        d root root 0750 /grand_parent/parent/me/child none
        d user1 group1 0700 /grand_parent/parent none
        d user2 group2 0777 /grand_parent none

  while processing a first line we'd better not create
  /grand_parent, /grand_parent/parent, /grand_parent/parent/me with
  wrong (default) ownership and permissions. Instead we'd better fail
  to create them at all (it will likely be noticed and fixed).

Signed-off-by: Viacheslav Volkov <viacheslav.volkov.1@gmail.com>
---
 .../initscripts-1.0/populate-volatile.sh      | 76 +++++++++++++++----
 1 file changed, 61 insertions(+), 15 deletions(-)

Comments

Ross Burton Sept. 12, 2024, 3:46 p.m. UTC | #1
> On 27 Aug 2024, at 09:55, Viacheslav Volkov via lists.openembedded.org <viacheslav.volkov.1=gmail.com@lists.openembedded.org> wrote:
> +if [ -z "$ROOT_DIR" ]; then
> + SYNC_CMD="sync" # on target run sync
> +else
> + # At rootfs time sync is not required. Moreover sync symlink is not
> + # present in ${TMPDIR}/hosttools directory while building rootfs, hence
> + # attempting to execute sync would cause a silent error (further
> + # commands won't be executed).
> + SYNC_CMD="true"
> +fi

Doing a sync for _every_ file created seems like overkill, and if you’re doing atomic creation is it even useful? The correct file will exist or not exist: what does a sync add?

(also: it’s power-cut safe, not save)

Thanks,
Ross
Viacheslav Volkov Sept. 13, 2024, 6:02 a.m. UTC | #2
Hi Ross,

> Doing a sync for _every_ file created seems like overkill, and if you're
> doing atomic creation is it even useful? The correct file will exist or not
> exist: what does a sync add?

sync is an important part of atomic file creation. It acts as a barrier and
prevents possible reordering on file system level. To demonstrate an issue
without sync I did the following experiment on my board with NAND flash
formatted as UBIFS (simulating populate-volatile.sh behavior):

ssh board "
  # some_file* don't exist at this moment.
  touch some_file.populate-volatile.tmp # created with default permissions: 644
  chmod 600 some_file.populate-volatile.tmp # update permissions
  # Note: there is no sync at this moment.
  mv some_file.populate-volatile.tmp some_file
  ls -l some_file* # Make sure we've updated permissions. Output:
  # -rw-------    1 root     root             0 Sep 13 08:36 some_file
"
sleep 5 # Give a board some time to write data to flash
        # (otherwise some_file* is not preserved at all in my case).
power_off_board.sh # Physically cut board's power.
sleep 15 # Keep the board off for some time..
power_on_board.sh # Turn board's power on, it starts booting automatically.
sleep 40 # Give a board time to boot
ssh board "
  ls -l some_file* # Output:
  # -rw-r--r--    1 root     root             0 Sep 13 08:36 some_file
"

Above we can observe some_file with wrong (default) permissions. Effect of
chmod has been lost due to power cut. Furthermore since some_file now exists,
it won't be processed by next populate-volatile.sh execution (after reboot) and
hence remains with wrong permissions forever. A behavior with ownership is the
same as with permissions in my case.

Additionally I'd like to quote UBIFS FAQ here (I've provided a link to it in
my patch):

> Changing a file atomically means changing its contents in a way that unclean
> reboots could not lead to any corruption or inconsistency in the file. The
> only reliable way to do this in UBIFS (and in most of other file-systems,
> e.g. JFFS2 or ext3) is the following:
>
> * make a copy of the file;
> * change the copy;
> * synchronize the copy;
> * re-name the copy to the file (using the rename() libc function or the mv
>   utility).
>
> Note, if a power-cut happens during the re-naming, the original file will be
> intact because the re-name operation is atomic. This is a POSIX requirement
> and UBIFS satisfies it.
>
> Often applications do not do the third step - synchronizing the copy.
> Although this is generally an application bug, the ext4 file-system has a
> hack which makes sure the data of the copy hits the disk before the re-name
> meta-data, which "fixes" buggy applications. However, UBIFS does not have
> this feature, although we plan to implement it.

I understand that sync implies some performance degradation but IMHO we should
prefer correctness over performance by default. I did some basic performance
experiments: time /etc/init.d/populate-volatile.sh
In my case I didn't notice any measurable difference in execution time with and
without sync command in populate-volatile.sh (22 files were created, which is
not that much, I agree..).

> (also: it's power-cut safe, not save)

Agree, it'd be better to fix it. I'll wait for further feedback before sending
a v2 patch with this fix..

Regards,
Viacheslav
diff mbox series

Patch

diff --git a/meta/recipes-core/initscripts/initscripts-1.0/populate-volatile.sh b/meta/recipes-core/initscripts/initscripts-1.0/populate-volatile.sh
index bc630e871c..c7b95e0540 100755
--- a/meta/recipes-core/initscripts/initscripts-1.0/populate-volatile.sh
+++ b/meta/recipes-core/initscripts/initscripts-1.0/populate-volatile.sh
@@ -25,24 +25,45 @@  ROOT_DIR="$(echo "$DIRNAME" | sed -ne 's:/etc/.*::p')"
 CFGDIR="${ROOT_DIR}/etc/default/volatiles"
 TMPROOT="${ROOT_DIR}/var/volatile/tmp"
 COREDEF="00_core"
+SUFFIX=".populate-volatile.tmp"
+if [ -z "$ROOT_DIR" ]; then
+	SYNC_CMD="sync" # on target run sync
+else
+	# At rootfs time sync is not required. Moreover sync symlink is not
+	# present in ${TMPDIR}/hosttools directory while building rootfs, hence
+	# attempting to execute sync would cause a silent error (further
+	# commands won't be executed).
+	SYNC_CMD="true"
+fi
 
 [ "${VERBOSE}" != "no" ] && echo "Populating volatile Filesystems."
 
 create_file() {
-	EXEC=""
+	EXEC="(
+	clean_temp()
+	{
+		rm -rf \"${1}${SUFFIX}\"
+	}
+	trap clean_temp EXIT
+	clean_temp&&
+	"
 	if [ -z "$2" ]; then
 		EXEC="
-		touch \"$1\";
+		${EXEC}
+		touch \"${1}${SUFFIX}\"&&
 		"
 	else
 		EXEC="
-		cp \"$2\" \"$1\";
+		${EXEC}
+		cp \"$2\" \"${1}${SUFFIX}\"&&
 		"
 	fi
 	EXEC="
 	${EXEC}
-	chown ${TUSER}:${TGROUP} $1 || echo \"Failed to set owner -${TUSER}- for -$1-.\";
-	chmod ${TMODE} $1 || echo \"Failed to set mode -${TMODE}- for -$1-.\" "
+	chown \"${TUSER}:${TGROUP}\" \"${1}${SUFFIX}\"&&
+	chmod \"${TMODE}\" \"${1}${SUFFIX}\"&&
+	$SYNC_CMD \"${1}${SUFFIX}\"&&
+	mv \"${1}${SUFFIX}\" \"$1\")"
 
 	test "$VOLATILE_ENABLE_CACHE" = yes && echo "$EXEC" >> /etc/volatile.cache.build
 
@@ -62,10 +83,18 @@  create_file() {
 }
 
 mk_dir() {
-	EXEC="
-	mkdir -p \"$1\";
-	chown ${TUSER}:${TGROUP} $1 || echo \"Failed to set owner -${TUSER}- for -$1-.\";
-	chmod ${TMODE} $1 || echo \"Failed to set mode -${TMODE}- for -$1-.\" "
+	EXEC="(
+	clean_temp()
+	{
+		rm -rf \"${1}${SUFFIX}\"
+	}
+	trap clean_temp EXIT
+	clean_temp&&
+	mkdir \"${1}${SUFFIX}\"&&
+	chown \"${TUSER}:${TGROUP}\" \"${1}${SUFFIX}\"&&
+	chmod \"${TMODE}\" \"${1}${SUFFIX}\"&&
+	$SYNC_CMD \"${1}${SUFFIX}\"&&
+	mv \"${1}${SUFFIX}\" \"$1\")"
 
 	test "$VOLATILE_ENABLE_CACHE" = yes && echo "$EXEC" >> /etc/volatile.cache.build
 	if [ -e "$1" ]; then
@@ -82,20 +111,37 @@  mk_dir() {
 }
 
 link_file() {
-	EXEC="
+	EXEC="(
+	clean_temp()
+	{
+		rm -rf \"${2}${SUFFIX}\"
+	}
+	create_symlink()
+	{
+		ln -sf \"$1\" \"${2}${SUFFIX}\"&&
+		chown -h \"${TUSER}:${TGROUP}\" \"${2}${SUFFIX}\"&&
+		$SYNC_CMD \"${2}${SUFFIX}\"&&
+		mv \"${2}${SUFFIX}\" \"$2\"
+	}
+	trap clean_temp EXIT
+	clean_temp&&
 	if [ -L \"$2\" ]; then
-		[ \"\$(readlink \"$2\")\" != \"$1\" ] && { rm -f \"$2\"; ln -sf \"$1\" \"$2\"; };
+		if [ \"\$(readlink \"$2\")\" != \"$1\" ]; then
+			rm -f \"$2\"&&
+			create_symlink
+		fi
 	elif [ -d \"$2\" ]; then
 		if awk '\$2 == \"$2\" {exit 1}' /proc/mounts; then
 			cp -a $2/* $1 2>/dev/null;
 			cp -a $2/.[!.]* $1 2>/dev/null;
-			rm -rf \"$2\";
-			ln -sf \"$1\" \"$2\";
+			$SYNC_CMD&&
+			rm -rf \"$2\"&&
+			create_symlink
 		fi
 	else
-		ln -sf \"$1\" \"$2\";
+		create_symlink
 	fi
-        "
+	)"
 
 	test "$VOLATILE_ENABLE_CACHE" = yes && echo "	$EXEC" >> /etc/volatile.cache.build