diff mbox series

[pseudo,2/2] pseudo_db: fix duplicated xattr in rename

Message ID 20251111135951.2121065-2-hongxu.jia@windriver.com
State New
Headers show
Series [pseudo,1/2] pseudo_ipc.h: make 8 byte alignment for structure pseudo_msg_t | expand

Commit Message

Hongxu Jia Nov. 11, 2025, 1:59 p.m. UTC
In commit [fbaffe9 Handle rename(3) across devices.], it defines function
pdb_update_inode to change dev/inode for a given path -- used only by
RENAME for now.

In commit [58e4171 xattr work: handle deletes and renames], in function
pdb_update_inode, it defines oldmsg to test the existence of msg->path.
If yes, copy xattrs from oldmsg to msg. If both of oldmsg and msg have
the same dev/inode, it caused duplicated xattrs in the given path.
Here are the reproducible steps:

1) Create a file in rootfs
mkdir -p rootfs; touch rootfs/file

2) Use IMA/EVM signing utility to set xattr
evmctl ima_sign --hashalgo sha256 --key path-to/ima_keys/x509_ima.key --pass=xxx ./rootfs/file

3) List xattr, get security.ima
$ getfattr -d -m ^ -R -- rootfs/
- rootfs/
|# file: rootfs//file
security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw==

4) Rename
$ mv rootfs/file rootfs/rename_file

5) List xattr, found duplicated security.ima
$ getfattr -d -m ^ -R -- rootfs/
- rootfs/
|# file: rootfs//rename_file
security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw==
security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw==

This commit:

- Due to oldmsg is static variable, clean up the memory of oldmsg
  to avoid dirty data from last call.

- Other than copy the whole msg to oldmsg, only copy path and
  pathlen from msg to oldmsg, use function pdb_find_file_path
  to fill oldmsg->dev and oldmsg->ino if it existed

- Copy xattr only if the existed oldmsg have different dev/inode
  with msg to avoid duplicated xattr for the same dev/indoe

Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
---
 pseudo_db.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Mark Hatle Nov. 12, 2025, 5:58 p.m. UTC | #1
I don't completely understand what is going wrong here.  Some comments below to the patch that might help me understand it.

On 11/11/25 7:59 AM, Hongxu Jia wrote:
> In commit [fbaffe9 Handle rename(3) across devices.], it defines function
> pdb_update_inode to change dev/inode for a given path -- used only by
> RENAME for now.
> 
> In commit [58e4171 xattr work: handle deletes and renames], in function
> pdb_update_inode, it defines oldmsg to test the existence of msg->path.
> If yes, copy xattrs from oldmsg to msg. If both of oldmsg and msg have
> the same dev/inode, it caused duplicated xattrs in the given path.
> Here are the reproducible steps:
> 
> 1) Create a file in rootfs
> mkdir -p rootfs; touch rootfs/file
> 
> 2) Use IMA/EVM signing utility to set xattr
> evmctl ima_sign --hashalgo sha256 --key path-to/ima_keys/x509_ima.key --pass=xxx ./rootfs/file
> 
> 3) List xattr, get security.ima
> $ getfattr -d -m ^ -R -- rootfs/
> - rootfs/
> |# file: rootfs//file
> security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw==
> 
> 4) Rename
> $ mv rootfs/file rootfs/rename_file
> 
> 5) List xattr, found duplicated security.ima
> $ getfattr -d -m ^ -R -- rootfs/
> - rootfs/
> |# file: rootfs//rename_file
> security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw==
> security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw==
> 
> This commit:
> 
> - Due to oldmsg is static variable, clean up the memory of oldmsg
>    to avoid dirty data from last call.

See below, but the memcpy of the new msg should both setup the oldmsg and clear any old data.  It was expected that subsequent calls would clear any specific data we didn't want preserved (for whatever reason) -- See below.

> - Other than copy the whole msg to oldmsg, only copy path and
>    pathlen from msg to oldmsg, use function pdb_find_file_path
>    to fill oldmsg->dev and oldmsg->ino if it existed
> 
> - Copy xattr only if the existed oldmsg have different dev/inode
>    with msg to avoid duplicated xattr for the same dev/indoe
> 
> Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
> ---
>   pseudo_db.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/pseudo_db.c b/pseudo_db.c
> index 8b23938..d6ae1d3 100644
> --- a/pseudo_db.c
> +++ b/pseudo_db.c
> @@ -2188,10 +2188,13 @@ pdb_update_inode(pseudo_msg_t *msg) {
>   		pseudo_diag("Can't update the inode of a file without its path.\n");
>   		return 1;
>   	}
> -	memcpy(oldmsg, msg, sizeof(*msg) + msg->pathlen);
> +	memset(oldmsg, 0, sizeof(*msg) + pseudo_path_max());

Originally we did NOT memset this, as we expected the memcpy to overwrite any existing memory.  This is why the memcpy was the entire old msg to the new msg entry.

> +	/* Copy path and pathlen from msg to oldmsg */
> +	memcpy(oldmsg->path, msg->path, msg->pathlen);
> +	oldmsg->pathlen = msg->pathlen;
>   	found_existing = !pdb_find_file_path(oldmsg);

The function above (pdb_find_file_path) was supposed to clear items from oldmsg (as necessary), basically resetting the dev/ino and any other stuff that this function cares about.  (That may no longer be true, I'm really not sure!)

I'm not against memset and then partial copy, but do we need any of the other fields?  This is the part I'm really unclear about.  (Especially uid/gid/mode and op.)

> -	if (found_existing) {
> -		/* we have an existing file entry */
> +	if (found_existing && (msg->dev!=oldmsg->dev || msg->ino!=oldmsg->ino)) {
> +		/* we have an existing file entry with different dev/inode */

The above change looks correct to me.  I'm not sure why this wasn't done original, maybe an oversight.

>   		pdb_copy_xattrs(oldmsg, msg);
>   	}

if you don't do the xattr copy then is the issue that we have 'junk' in the 'oldmsg' that needs to be cleared?  Or are there other fields in oldmsg that are the concern?

Unless there is a specific reason, I would rather copy of the msg -> oldmsg and then clear just the items we do not want.  This should clear old data and give us a known starting point for the "new" oldmsg.  We can then sanitize/memset/clear whatever fields we don't care about.

>   	sqlite3_bind_int(update, 1, msg->dev);
Hongxu Jia Nov. 13, 2025, 11:49 a.m. UTC | #2
On 11/13/25 01:58, Mark Hatle wrote:
> CAUTION: This email comes from a non Wind River email account!
> Do not click links or open attachments unless you recognize the sender 
> and know the content is safe.
>
> I don't completely understand what is going wrong here.  Some comments 
> below to the patch that might help me understand it.
>
> On 11/11/25 7:59 AM, Hongxu Jia wrote:
>> In commit [fbaffe9 Handle rename(3) across devices.], it defines 
>> function
>> pdb_update_inode to change dev/inode for a given path -- used only by
>> RENAME for now.
>>
>> In commit [58e4171 xattr work: handle deletes and renames], in function
>> pdb_update_inode, it defines oldmsg to test the existence of msg->path.
>> If yes, copy xattrs from oldmsg to msg. If both of oldmsg and msg have
>> the same dev/inode, it caused duplicated xattrs in the given path.
>> Here are the reproducible steps:
>>
>> 1) Create a file in rootfs
>> mkdir -p rootfs; touch rootfs/file
>>
>> 2) Use IMA/EVM signing utility to set xattr
>> evmctl ima_sign --hashalgo sha256 --key path-to/ima_keys/x509_ima.key 
>> --pass=xxx ./rootfs/file
>>
>> 3) List xattr, get security.ima
>> $ getfattr -d -m ^ -R -- rootfs/
>> - rootfs/
>> |# file: rootfs//file
>> security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw== 
>>
>>
>> 4) Rename
>> $ mv rootfs/file rootfs/rename_file
>>
>> 5) List xattr, found duplicated security.ima
>> $ getfattr -d -m ^ -R -- rootfs/
>> - rootfs/
>> |# file: rootfs//rename_file
>> security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw== 
>>
>> security.ima=0sAwIEOI1nBgEASdQgr4UDRTIYmF7q5SHFsyt3J/TO0Pwwz/hDiV3m4kjWALSKGC0Ea7wel6cZ1CcA/hQQ6gW5e86kswTCFexmM6qdE1MVG1IrP1L9YZcpJ3Ghh5bJS8S7Kb/J2FCaaPW9skGn8vG339CRHUuNz2z6uy6pryjLAVoAyl0jdqDlUTmhzfd7Bq6i1HLGSyBxm9nPwFDe4E5ubpwJlrN11YHUJSWVoAmpMfPLgu2bDalAz/I6tTMFdrEoR7SH0F9LuwKGnq4CAzrpsts/8/islsKWWPmmalv635EGG/k7g7zY0D/cHjc3YyaMYtvd43OWbTu+2j6w8IywmaZXN/YZg4AMgw== 
>>
>>
>> This commit:
>>
>> - Due to oldmsg is static variable, clean up the memory of oldmsg
>>    to avoid dirty data from last call.
>
> See below, but the memcpy of the new msg should both setup the oldmsg 
> and clear any old data.  It was expected that subsequent calls would 
> clear any specific data we didn't want preserved (for whatever reason) 
> -- See below.
>
I am afraid the memcpy of new to old msg could not completely clean up  
the path buffer, such as

At 1st call, if msg->path = "abcde", msg->pathlen = 5, then oldmsg->path 
= "abcde", oldmsg->pathlen = 5;

At 2ed call, if msg->path = "efg", msg->pathlen = 3; if we use memcpy,  
then oldmsg->path = "efgde", oldmsg->pathlen = 3; and then input oldmsg 
to function pdb_find_file_path, only msg->path is used for 
sqlite3_bind_text,

it search "efgde", other than expected "efg"


>> - Other than copy the whole msg to oldmsg, only copy path and
>>    pathlen from msg to oldmsg, use function pdb_find_file_path
>>    to fill oldmsg->dev and oldmsg->ino if it existed
>>
>> - Copy xattr only if the existed oldmsg have different dev/inode
>>    with msg to avoid duplicated xattr for the same dev/indoe
>>
>> Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
>> ---
>>   pseudo_db.c | 9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/pseudo_db.c b/pseudo_db.c
>> index 8b23938..d6ae1d3 100644
>> --- a/pseudo_db.c
>> +++ b/pseudo_db.c
>> @@ -2188,10 +2188,13 @@ pdb_update_inode(pseudo_msg_t *msg) {
>>               pseudo_diag("Can't update the inode of a file without 
>> its path.\n");
>>               return 1;
>>       }
>> -     memcpy(oldmsg, msg, sizeof(*msg) + msg->pathlen);
>> +     memset(oldmsg, 0, sizeof(*msg) + pseudo_path_max());
>
> Originally we did NOT memset this, as we expected the memcpy to 
> overwrite any existing memory.  This is why the memcpy was the entire 
> old msg to the new msg entry.
>
See my above comments
>> +     /* Copy path and pathlen from msg to oldmsg */
>> +     memcpy(oldmsg->path, msg->path, msg->pathlen);
>> +     oldmsg->pathlen = msg->pathlen;
>>       found_existing = !pdb_find_file_path(oldmsg);
>
> The function above (pdb_find_file_path) was supposed to clear items 
> from oldmsg (as necessary), basically resetting the dev/ino and any 
> other stuff that this function cares about.  (That may no longer be 
> true, I'm really not sure!)
>
The reason why I explicitly operate oldmsg->path and oldmsg->pathlen is 
tell developer only them are used by pdb_find_file_path as input,

and pdb_find_file_path will fill dev/ino if true, and dev/ino will be 
empty if pdb_find_file_path return false

> I'm not against memset and then partial copy, but do we need any of 
> the other fields?  This is the part I'm really unclear about. 
> (Especially uid/gid/mode and op.)
>
In table xattr of sqlite, it defines dev, ino, name, value, so dev and 
ino of msg is used, and uid/gid/mode are not


>> -     if (found_existing) {
>> -             /* we have an existing file entry */
>> +     if (found_existing && (msg->dev!=oldmsg->dev || 
>> msg->ino!=oldmsg->ino)) {
>> +             /* we have an existing file entry with different 
>> dev/inode */
>
> The above change looks correct to me.  I'm not sure why this wasn't 
> done original, maybe an oversight.
>
The issue was found after commit [d1db9c2 pseudo_db.c: Fix xattr memory 
allocation] in which make the logic works.


>>               pdb_copy_xattrs(oldmsg, msg);
>>       }
>
> if you don't do the xattr copy then is the issue that we have 'junk' 
> in the 'oldmsg' that needs to be cleared?  Or are there other fields 
> in oldmsg that are the concern?

I think we should not do the xattr copy, but I respect the design of 
original author, so I just add the filter condition 
(msg->dev!=oldmsg->dev || msg->ino!=oldmsg->ino)

I have no idea on what situation for a given path, the dev/inode will be 
changed, for file rename operation, usually a given dev/inode, the path 
changed


>
> Unless there is a specific reason, I would rather copy of the msg -> 
> oldmsg and then clear just the items we do not want.  This should 
> clear old data and give us a known starting point for the "new" 
> oldmsg.  We can then sanitize/memset/clear whatever fields we don't 
> care about.
>
Only memcpy is not sufficient for oldmsg, the oldmsg->path buffer, see 
my above comments

//Hongxu

>>       sqlite3_bind_int(update, 1, msg->dev);
>
diff mbox series

Patch

diff --git a/pseudo_db.c b/pseudo_db.c
index 8b23938..d6ae1d3 100644
--- a/pseudo_db.c
+++ b/pseudo_db.c
@@ -2188,10 +2188,13 @@  pdb_update_inode(pseudo_msg_t *msg) {
 		pseudo_diag("Can't update the inode of a file without its path.\n");
 		return 1;
 	}
-	memcpy(oldmsg, msg, sizeof(*msg) + msg->pathlen);
+	memset(oldmsg, 0, sizeof(*msg) + pseudo_path_max());
+	/* Copy path and pathlen from msg to oldmsg */
+	memcpy(oldmsg->path, msg->path, msg->pathlen);
+	oldmsg->pathlen = msg->pathlen;
 	found_existing = !pdb_find_file_path(oldmsg);
-	if (found_existing) {
-		/* we have an existing file entry */
+	if (found_existing && (msg->dev!=oldmsg->dev || msg->ino!=oldmsg->ino)) {
+		/* we have an existing file entry with different dev/inode */
 		pdb_copy_xattrs(oldmsg, msg);
 	}
 	sqlite3_bind_int(update, 1, msg->dev);