Bug 2369561 - flistxattr with right size wrongly fails with ERANGE, breaking 'cp -a' etc
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 42
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2025-05-31 14:57 UTC by Paul Eggert
Modified: 2025-06-06 05:48 UTC (History)
CC List: 15 users

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type: ---
Embargoed:



Description Paul Eggert 2025-05-31 14:57:34 UTC
flistxattr fails with ERANGE on a tmpfs file with an ACL, even though the buffer is large enough. For flistxattr to succeed, the buffer must be considerably larger than what is needed.

This breaks xattr-related programs that call B = flistxattr(fd, NULL, 0) to get the needed buffer size B and then call flistxattr(fd, buffer, B): the second call incorrectly fails with ERANGE, and so the program fails. I ran into this problem with bleeding-edge coreutils, whose 'cp -a' incorrectly fails due to the flistxattr bug. I assume listxattr and llistxattr have similar problems.
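For readers unfamiliar with the idiom, here is a minimal sketch of the two-call pattern described above. The helper name list_xattrs_two_call is hypothetical and is not taken from coreutils or any library; it only illustrates the sequence of calls that an affected kernel breaks.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical helper (illustration only): list the xattr names of FD
   with the size-probe idiom.  On success, store a malloc'ed buffer in
   *NAMES and return its length; a return of 0 means there are no
   attributes and *NAMES is NULL.  On failure, return -1 with errno set.  */
ssize_t
list_xattrs_two_call (int fd, char **names)
{
  assert (names != NULL);
  *names = NULL;

  /* Call 1: a zero-size request asks how many bytes are needed.  */
  ssize_t needed = flistxattr (fd, NULL, 0);
  if (needed <= 0)
    return needed;

  char *buf = malloc (needed);
  if (!buf)
    return -1;

  /* Call 2: repeat the request with a buffer of exactly that size.
     On an affected kernel, this is the call that fails with ERANGE.  */
  ssize_t len = flistxattr (fd, buf, needed);
  if (len < 0)
    {
      int e = errno;
      free (buf);
      errno = e;
      return -1;
    }

  *names = buf;
  return len;
}
```

Even on a correct kernel the second call can fail with ERANGE if another process adds attributes between the two calls, so callers of this pattern generally need an error path for it anyway.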

I ran into the flistxattr bug recently on Fedora 42, with 'uname -rvm' reporting "6.14.8-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 22 19:26:02 UTC 2025 x86_64". I am not running any modules not shipped directly with Fedora. This is a regression, as the bug does not occur on RHEL 9.5 with 'uname -rvm' reporting "5.14.0-503.21.1.el9_5.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 19 09:37:00 EST 2024 x86_64". Although I don't know exactly which kernel version introduced the bug, it must be recent, as I didn't see the bug in earlier Fedora 42 usage. My suspicion is that the bug is related to Stephen Smalley's recent kernel patch <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8b0ba61d>.

To reproduce the problem, use a tmpfs file system. Suppose it's /dev/shm. Run the following commands:

   touch /dev/shm/foo
   setfacl -m user:1:1 /dev/shm/foo
   gcc xattrbug.c # See end of bug report for source code.
   ./a.out </dev/shm/foo

./a.out should quietly exit with status 0. However, it outputs this:

  flistxattr (..., 17) failed with 17-byte xattr!: Numerical result out of range
  flistxattr (..., 41) returned 17, but smaller requests failed!

and then exits with status 1. strace reports the following two lines:

  flistxattr(0, "system.posix_acl_", 10000) = 17
  flistxattr(0, 0x403060, 17) = -1 ERANGE (Numerical result out of range)

and the second line is obviously wrong; flistxattr should succeed and return 17.

Here is the source code for xattrbug.c:

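  #include <errno.h>
  #include <stdio.h>
  #include <sys/xattr.h>
  #include <unistd.h>

  char buf[10000];

  int
  main ()
  {
    /* Use a generous buffer to learn the actual size of the list.  */
    ssize_t actual_size = flistxattr (STDIN_FILENO, buf, sizeof buf);
    if (actual_size < 0)
      return perror ("flistxattr with big buffer"), 1;

    /* Retry with the exact size, then with ever larger sizes, until a
       call succeeds.  On a correct kernel the first retry succeeds.  */
    for (ssize_t s = actual_size; s < (ssize_t) sizeof buf; s++)
      {
        ssize_t t = flistxattr (STDIN_FILENO, buf, s);
        if (t < 0)
          {
            /* Report only the first failure, at the exact size.  */
            if (s == actual_size)
              {
                int e = errno;
                fprintf (stderr,
                         ("flistxattr (..., %lld) failed with %lld-byte xattr"),
                         (long long int) s, (long long int) actual_size);
                errno = e;
                perror ("!");
              }
          }
        else
          {
            /* Success with a buffer larger than the list means that
               the smaller requests failed wrongly.  */
            if (s != t)
              {
                printf (("flistxattr (..., %lld) returned %lld,"
                         " but smaller requests failed!\n"),
                        (long long int) s, (long long int) t);
                return 1;
              }
            break;
          }
      }
    return 0;
  }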


Reproducible: Always

Comment 1 Paul Eggert 2025-05-31 15:17:56 UTC
Also, the flistxattr kernel bug breaks library functions like attr_copy_fd and attr_copy_file. These library functions can and should work around the kernel bug, so I filed a bug report for that upstream at <https://lists.nongnu.org/archive/html/acl-devel/2025-05/msg00003.html>. You can see the whole listxattr thread at <https://lists.nongnu.org/archive/html/acl-devel/2025-05/threads.html>.
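As an illustration of what a userspace workaround could look like, here is a minimal sketch that retries with a larger buffer whenever flistxattr fails with ERANGE. This is not the code from the attr patches; the names list_xattrs_retry, grow_size and LIST_LIMIT are hypothetical, and the 65536-byte cap is an assumption chosen to match XATTR_LIST_MAX in <linux/limits.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Assumed upper bound on the list size; matches XATTR_LIST_MAX.  */
#define LIST_LIMIT 65536

/* Hypothetical helper: double SIZE, but never exceed LIMIT.  */
size_t
grow_size (size_t size, size_t limit)
{
  return size < limit / 2 ? size * 2 : limit;
}

/* Hypothetical helper (illustration only): like the two-call idiom,
   but on ERANGE retry with a larger buffer instead of giving up.
   On success, store a malloc'ed buffer in *NAMES and return the list
   length.  On failure, return -1 with errno set.  */
ssize_t
list_xattrs_retry (int fd, char **names)
{
  assert (names != NULL);
  *names = NULL;

  /* Probe for the size that the kernel claims is needed.  */
  ssize_t needed = flistxattr (fd, NULL, 0);
  if (needed <= 0)
    return needed;

  size_t size = needed;
  for (;;)
    {
      char *buf = malloc (size);
      if (!buf)
        return -1;

      ssize_t len = flistxattr (fd, buf, size);
      if (len >= 0)
        {
          *names = buf;
          return len;
        }

      int e = errno;
      free (buf);

      /* Give up on any error other than ERANGE, or once the buffer
         has already reached the assumed limit.  */
      if (e != ERANGE || size >= LIST_LIMIT)
        {
          errno = e;
          return -1;
        }
      size = grow_size (size, LIST_LIMIT);
    }
}
```

The same retry loop also covers the legitimate case where attributes are added between the probe and the real call, so it costs nothing extra on kernels without the bug.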

Comment 2 Paul Eggert 2025-06-05 07:05:46 UTC
(In reply to Paul Eggert from comment #1)
> Also, the flistxattr kernel bug breaks library functions like attr_copy_fd
> and attr_copy_file. These library functions can and should work around the
> kernel bug, so I filed a bug report for that upstream

A patch to those library functions was installed (see <https://cgit.git.savannah.gnu.org/cgit/attr.git/commit/?id=58abfe6eba0d8d58a61ee8bee0615f74d393fff2>), so bleeding-edge libxattr on savannah.nongnu.org now works around the kernel bug. However, a new libxattr version hasn't been released and of course it would take some time for any such release to propagate to systems affected by the kernel bug. Also, libxattr needs at least one more patch, namely, patch 0001 in the email <https://lists.gnu.org/r/acl-devel/2025-06/msg00001.html> that I just sent to acl-devel.

Comment 3 Paul Eggert 2025-06-05 20:40:34 UTC
Today Stephen Smalley proposed a simple kernel patch <https://lore.kernel.org/linux-fsdevel/m1wm9qund4.fsf@gmail.com/T/> in the linux-fsdevel mailing list.

Also, bleeding-edge libxattr now has the workaround fix <https://cgit.git.savannah.gnu.org/cgit/attr.git/commit/?id=504ab19d7b032212755ab3c7df16be98d5b5212e>, so bleeding-edge libxattr should now be immune to the kernel bug.

Comment 4 Collin Funk 2025-06-06 05:48:16 UTC
(In reply to Paul Eggert from comment #3)
> Today Stephen Smalley proposed a simple kernel patch
> <https://lore.kernel.org/linux-fsdevel/m1wm9qund4.fsf@gmail.com/T/> in the
> linux-fsdevel mailing list.

I compiled a kernel with this proposed patch and confirmed that it fixes the issue [1].

[1] https://lore.kernel.org/selinux/87plfhsa2r.fsf@gmail.com/

