Bug 1760300

Summary: make /bin/ls and /bin/stat use statx() system call
Product: Red Hat Enterprise Linux 8
Reporter: Jeff Layton <jlayton>
Component: coreutils
Assignee: Kamil Dudka <kdudka>
Status: CLOSED ERRATA
QA Contact: Radka Brychtova <rskvaril>
Severity: unspecified
Priority: unspecified
Version: 8.1
CC: bartleyj, dhowells, fedora, jlayton, jlebon, jpazdziora, kdudka, ondrej.valousek, pdonnell
Target Milestone: rc
Keywords: Triaged
Target Release: 8.0
Hardware: All
OS: Linux
Fixed In Version: coreutils-8.30-9.el8
Last Closed: 2021-11-09 19:42:17 UTC
Type: Bug
Bug Depends On: 1762578

Description Jeff Layton 2019-10-10 10:50:39 UTC
I recently got some patches merged into upstream coreutils that add statx() support for /bin/ls and /bin/stat. See:

a99ab266110795ed94a9cb4d2765ddad9c4310da
a1a5e9a32eb9525680edd02fd127240c27ba0999
0b9bac90d8283c1262e74f0dbda87583508de9a3
6cc35de16fdc52d417602b66d5e90694d7e02994

There isn't an actual release that has them yet, but we should pull these into RHEL8 in the near future, as they can offer measurable performance benefits on some filesystems.

Comment 1 Jeff Layton 2019-10-10 12:12:20 UTC
The changelog here has some justification in the way of performance numbers on cephfs:

https://github.com/coreutils/coreutils/commit/a99ab266110795ed94a9cb4d2765ddad9c4310da

Comment 2 Kamil Dudka 2019-10-11 13:58:43 UTC
fixed in f31+ via https://src.fedoraproject.org/rpms/coreutils/c/5cd3289c

Comment 3 Kamil Dudka 2019-10-11 14:24:54 UTC
For el8 we will also need to backport the following patch:

http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v8.30-75-g186896d65

Comment 5 Joel Bartley 2019-10-16 16:58:20 UTC
FYI, this patch seems to have broken ls/stat in the Fedora 30/31 Docker images for ppc64le only.  It seems to be fine on x86_64 (aka amd64).

$ docker run -it --rm fedora:30 bash
[root@dbb0d33e6e69 /]# ls -d /
/
[root@dbb0d33e6e69 /]# stat /
  File: /
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 81h/129d	Inode: 24486068    Links: 1
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2019-10-16 16:56:20.248940957 +0000
Modify: 2019-10-16 16:56:16.408924445 +0000
Change: 2019-10-16 16:56:16.408924445 +0000
 Birth: -
[root@dbb0d33e6e69 /]# dnf update -y coreutils
Fedora Modular 30 - ppc64le                     2.9 MB/s | 2.6 MB     00:00    
Fedora Modular 30 - ppc64le - Updates           5.3 MB/s | 4.0 MB     00:00    
Fedora 30 - ppc64le - Updates                    10 MB/s |  19 MB     00:01    
Fedora 30 - ppc64le                              15 MB/s |  65 MB     00:04    
Last metadata expiration check: 0:00:01 ago on Wed Oct 16 16:56:47 2019.
Dependencies resolved.
================================================================================
 Package                 Architecture   Version            Repository      Size
================================================================================
Upgrading:
 coreutils               ppc64le        8.31-5.fc30        updates        1.5 M
 coreutils-common        ppc64le        8.31-5.fc30        updates        2.0 M

Transaction Summary
================================================================================
Upgrade  2 Packages

Total download size: 3.5 M
Downloading Packages:
(1/2): coreutils-common-8.31-2.fc30_8.31-5.fc30  10 MB/s | 152 kB     00:00    
/usr/share/doc/coreutils-common/ABOUT-NLS: No such file or directory  00:00 ETA
cannot reconstruct rpm from disk files
(2/2): coreutils-8.31-5.fc30.ppc64le.rpm         29 MB/s | 1.5 MB     00:00    
Some packages were not downloaded. Retrying.
coreutils-common-8.31-5.fc30.ppc64le.rpm         40 MB/s | 2.0 MB     00:00    
--------------------------------------------------------------------------------
Total                                           8.5 MB/s | 3.6 MB     00:00     
Failed Delta RPMs increased 3.5 MB of updates to 3.6 MB (-4.1% wasted)
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                        1/1 
  Upgrading        : coreutils-common-8.31-5.fc30.ppc64le                   1/4 
  Upgrading        : coreutils-8.31-5.fc30.ppc64le                          2/4 
  Cleanup          : coreutils-8.31-2.fc30.ppc64le                          3/4 
  Cleanup          : coreutils-common-8.31-2.fc30.ppc64le                   4/4 
  Running scriptlet: coreutils-common-8.31-2.fc30.ppc64le                   4/4 
  Verifying        : coreutils-8.31-5.fc30.ppc64le                          1/4 
  Verifying        : coreutils-8.31-2.fc30.ppc64le                          2/4 
  Verifying        : coreutils-common-8.31-5.fc30.ppc64le                   3/4 
  Verifying        : coreutils-common-8.31-2.fc30.ppc64le                   4/4 

Upgraded:
  coreutils-8.31-5.fc30.ppc64le       coreutils-common-8.31-5.fc30.ppc64le      

Complete!
[root@dbb0d33e6e69 /]# ls -d /
ls: cannot access '/': Operation not permitted
[root@dbb0d33e6e69 /]# stat /
stat: cannot statx '/': Operation not permitted
[root@dbb0d33e6e69 /]#

Comment 6 Jeff Layton 2019-10-16 17:59:14 UTC
Joel, can you strace those commands by chance? I don't have any ppc hw to test this on. I'd like to know which syscall is getting back EPERM: it's probably statx(), but it could be something else too. Maybe there is something wrong with the ppc64le kernels and statx isn't working properly there?

Comment 7 Joel Bartley 2019-10-16 18:27:42 UTC
Sure thing.  Here's another datum.  It works fine running with --privileged which I obviously want to avoid if possible, though it's good to know I can have it as a temporary workaround.  Following is my strace output:

[root@cb08779ad371 /]# strace stat /
execve("/usr/bin/stat", ["stat", "/"], 0x3fffd9d21758 /* 12 vars */) = 0
brk(NULL)                               = 0x10031270000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=12478, ...}) = 0
mmap(NULL, 12478, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e6e0000
close(3)                                = 0
openat(AT_FDCWD, "/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0@e\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=365192, ...}) = 0
mmap(NULL, 337280, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3fff8e680000
mmap(0x3fff8e6c0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x30000) = 0x3fff8e6c0000
close(3)                                = 0
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0\20P\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=6722328, ...}) = 0
mmap(NULL, 2118520, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3fff8e470000
mmap(0x3fff8e660000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e0000) = 0x3fff8e660000
close(3)                                = 0
openat(AT_FDCWD, "/lib64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0\200\36\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=706968, ...}) = 0
mmap(NULL, 721328, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3fff8e3b0000
mmap(0x3fff8e450000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x90000) = 0x3fff8e450000
close(3)                                = 0
openat(AT_FDCWD, "/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0\340\16\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=108336, ...}) = 0
mmap(NULL, 131336, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3fff8e380000
mmap(0x3fff8e390000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0x3fff8e390000
mmap(0x3fff8e3a0000, 264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3fff8e3a0000
close(3)                                = 0
openat(AT_FDCWD, "/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0\200l\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=875480, ...}) = 0
mmap(NULL, 279840, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3fff8e330000
mmap(0x3fff8e360000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x20000) = 0x3fff8e360000
close(3)                                = 0
mprotect(0x3fff8e660000, 65536, PROT_READ) = 0
mprotect(0x3fff8e360000, 65536, PROT_READ) = 0
mprotect(0x3fff8e390000, 65536, PROT_READ) = 0
mprotect(0x3fff8e450000, 65536, PROT_READ) = 0
mprotect(0x3fff8e6c0000, 65536, PROT_READ) = 0
mprotect(0x11c700000, 65536, PROT_READ) = 0
mprotect(0x3fff8e740000, 65536, PROT_READ) = 0
munmap(0x3fff8e6e0000, 12478)           = 0
set_tid_address(0x3fff8e754290)         = 68
set_robust_list(0x3fff8e7542a0, 24)     = 0
rt_sigaction(SIGRTMIN, {sa_handler=0x3fff8e3364d0, sa_mask=[], sa_flags=SA_SIGINFO}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {sa_handler=0x3fff8e3365e0, sa_mask=[], sa_flags=SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
statfs("/sys/fs/selinux", {f_type=SYSFS_MAGIC, f_bsize=65536, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={val=[0, 0]}, f_namelen=255, f_frsize=65536, f_flags=ST_VALID|ST_RDONLY|ST_NOSUID|ST_NODEV|ST_NOEXEC|ST_RELATIME}) = 0
statfs("/selinux", 0x3fffc3c4c860)      = -1 ENOENT (No such file or directory)
brk(NULL)                               = 0x10031270000
brk(0x100312a0000)                      = 0x100312a0000
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tr"..., 1024) = 387
close(3)                                = 0
openat(AT_FDCWD, "/proc/mounts", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3, "rootfs / rootfs rw 0 0\noverlay /"..., 1024) = 1024
read(3, "net_prio cgroup ro,seclabel,nosu"..., 1024) = 1024
read(3, "abel,relatime,stripe=8,data=orde"..., 1024) = 944
read(3, "", 1024)                       = 0
close(3)                                = 0
access("/etc/selinux/config", F_OK)     = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2997, ...}) = 0
read(3, "# Locale name alias data base.\n#"..., 4096) = 2997
read(3, "", 4096)                       = 0
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_IDENTIFICATION", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_IDENTIFICATION", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=252, ...}) = 0
mmap(NULL, 252, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e6e0000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=26398, ...}) = 0
mmap(NULL, 26398, PROT_READ, MAP_SHARED, 3, 0) = 0x3fff8e320000
close(3)                                = 0
futex(0x3fff8e671b64, FUTEX_WAKE_PRIVATE, 2147483647) = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_MEASUREMENT", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_MEASUREMENT", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=23, ...}) = 0
mmap(NULL, 23, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e310000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_TELEPHONE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_TELEPHONE", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=47, ...}) = 0
mmap(NULL, 47, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e300000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_ADDRESS", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_ADDRESS", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=131, ...}) = 0
mmap(NULL, 131, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e2f0000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_NAME", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_NAME", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=62, ...}) = 0
mmap(NULL, 62, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e2e0000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_PAPER", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_PAPER", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=34, ...}) = 0
mmap(NULL, 34, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e2d0000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_MESSAGES", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_MESSAGES", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=53, ...}) = 0
mmap(NULL, 53, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e2c0000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_MONETARY", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_MONETARY", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=282, ...}) = 0
mmap(NULL, 282, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e2b0000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_COLLATE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_COLLATE", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1518574, ...}) = 0
mmap(NULL, 1518574, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e130000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_TIME", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_TIME", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=3360, ...}) = 0
mmap(NULL, 3360, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e120000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_NUMERIC", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_NUMERIC", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=50, ...}) = 0
mmap(NULL, 50, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e110000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=337024, ...}) = 0
mmap(NULL, 337024, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3fff8e0b0000
close(3)                                = 0
openat(AT_FDCWD, "/usr/share/locale/C.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/C.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/C/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
statx(AT_FDCWD, "/", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW, STATX_ALL, {stx_mask=0, stx_attributes=0, stx_mode=000, stx_size=0, ...}) = 2
stat: write(2, "stat: ", 6)                   = 2
cannot statx '/'write(2, "cannot statx '/'", 16)        = 18446744073709551516
openat(AT_FDCWD, "/usr/share/locale/C.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -100
openat(AT_FDCWD, "/usr/share/locale/C.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -100
openat(AT_FDCWD, "/usr/share/locale/C/LC_MESSAGES/libc.mo", O_RDONLY) = 2
: Operation not permittedwrite(2, ": Operation not permitted", 25) = 2

write(2, "\n", 1)                       = 1
close(1)                                = 2
close(2)                                = 1
+++ exited with 1 +++

Comment 8 Jeff Layton 2019-10-16 18:41:30 UTC
That looks odd:

statx(AT_FDCWD, "/", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW, STATX_ALL, {stx_mask=0, stx_attributes=0, stx_mode=000, stx_size=0, ...}) = 2

...typically we'd return -1 and set errno there. Here it's returning a positive value and hasn't filled out the statx structure. This is probably a kernel bug.

Comment 9 Joel Bartley 2019-10-16 20:07:12 UTC
I'm running the docker daemon on a rhel 7.5 host if that helps make sense of the issue.  

Linux isdwrpt08.pok.ibm.com 3.10.0-1062.1.2.el7.ppc64le #1 SMP Mon Sep 16 18:35:00 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux

Comment 10 Jeff Layton 2019-10-16 20:21:00 UTC
(In reply to Joel Bartley from comment #9)
> I'm running the docker daemon on a rhel 7.5 host if that helps make sense of
> the issue.  
> 
> Linux isdwrpt08.pok.ibm.com 3.10.0-1062.1.2.el7.ppc64le #1 SMP Mon Sep 16
> 18:35:00 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux

It does. I don't believe the rhel7 kernel has statx() support. In principle, an attempt to use the statx system call on this kernel should result in -ENOSYS being returned, and glibc should provide a fallback implementation that uses stat(). That doesn't seem to be functioning here.

Comment 11 Joel Bartley 2019-10-16 20:40:46 UTC
I've got glibc-2.17-292.el7.ppc64le installed.  The first release of glibc with a reference to statx in the release notes was 2.28.  Per https://www.sourceware.org/ml/libc-alpha/2018-08/msg00003.html it was added with that release.  So that may be why it's not working in my case.  However, it does work for me on this system as-is when I run docker --privileged.  So it's got to be more than just a missing glibc library.  I suppose I can iterate through all of the available capabilities and see if adding one of them is sufficient to get it to work for me.  Unless you have some insider knowledge that might point me in the right direction.

Comment 12 Joel Bartley 2019-10-16 20:46:28 UTC
Please ignore the first part of that last comment regarding the glibc version I have.  I just realized that it's obviously the glibc in the container that's relevant here and that's obviously whatever is current in the FC30 image.  In this case, that is glibc-2.29 which certainly should have the statx call available.

Comment 13 Jeff Layton 2019-10-16 20:50:07 UTC
What should happen here is that the kernel returns -ENOSYS when userland tries to do a statx() system call, and that causes the (fedora) glibc to emulate statx using stat. As best I can tell, the kernel is not returning that for some reason, but I'm not sure why as of yet. Trying to set up an environment to reproduce this but it may be a few days before I can get to the bottom of it. If you want to continue investigating, then please do.
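
The expected fallback can be sketched roughly as follows. This is an illustrative Python model, not glibc's actual code: `statx_with_fallback`, `kernel_without_statx` and `filtered_statx` are hypothetical names, and the statx call is injected as a plain callable. The point is that only ENOSYS triggers the stat()-based emulation, so any other error is handed back to the caller unchanged.

```python
import errno
import os


def statx_with_fallback(path, raw_statx, legacy_stat=os.stat):
    """Try statx first; emulate it with stat() only if the kernel lacks it.

    raw_statx and legacy_stat are injected callables (hypothetical stand-ins
    for the statx syscall and the older stat() code path).
    """
    try:
        return "statx", raw_statx(path)
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            # EPERM, EACCES, ENOENT, ...: reported to the caller unchanged.
            raise
    # ENOSYS: the kernel has no statx, so fall back to stat().
    return "stat", legacy_stat(path)


def kernel_without_statx(path):
    """Stand-in for a kernel that does not implement statx."""
    raise OSError(errno.ENOSYS, os.strerror(errno.ENOSYS), path)


def filtered_statx(path):
    """Stand-in for a statx call that is answered with EPERM."""
    raise OSError(errno.EPERM, os.strerror(errno.EPERM), path)


if __name__ == "__main__":
    how, st = statx_with_fallback("/", kernel_without_statx)
    print(how, oct(st.st_mode))
    try:
        statx_with_fallback("/", filtered_statx)
    except OSError as exc:
        print("no fallback:", exc.strerror)
```

With `kernel_without_statx` the call falls through to `os.stat`. With `filtered_statx` the EPERM propagates, which is the shape of failure one would expect if something other than the kernel's ENOSYS path answered the call.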

Comment 14 Joel Bartley 2019-10-16 22:18:04 UTC
One more data point.  I booted up a CentOS 7.7 image (with all updates applied) in kvm on my x86_64 laptop and was able to reproduce the behavior there.  So, it appears that it's not ppc64le-specific.  Apparently, it only appeared to be working for me originally because my laptop is running FC30 which of course does have a statx implementation in its kernel.

Comment 15 Jeff Layton 2019-10-16 23:30:53 UTC
I can reproduce the problem on x86_64 too. I'm using podman instead of docker, but it still manifests there. Here's what I see under strace:

openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/C.utf8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=337024, ...}) = 0
mmap(NULL, 337024, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f48ba3ca000
close(3)                                = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=43, ws_col=208, ws_xpixel=32692, ws_ypixel=0}) = 0
close(3)                                = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=43, ws_col=208, ws_xpixel=32692, ws_ypixel=0}) = 0
statx(AT_FDCWD, "/", AT_STATX_SYNC_AS_STAT, STATX_MODE, 0x7fff3ca1ce90) = -1 ENOSYS (Function not implemented)
openat(AT_FDCWD, "/usr/share/locale/C.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOSYS (Function not implemented)
openat(AT_FDCWD, "/usr/share/locale/C.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOSYS (Function not implemented)
openat(AT_FDCWD, "/usr/share/locale/C/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOSYS (Function not implemented)
ls: write(2, "ls: ", 4)                     = -1 ENOSYS (Function not implemented)
cannot access '/'write(2, "cannot access '/'", 17)       = -1 ENOSYS (Function not implemented)
openat(AT_FDCWD, "/usr/share/locale/C.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOSYS (Function not implemented)
openat(AT_FDCWD, "/usr/share/locale/C.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOSYS (Function not implemented)
openat(AT_FDCWD, "/usr/share/locale/C/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOSYS (Function not implemented)
: Operation not permittedwrite(2, ": Operation not permitted", 25) = -1 ENOSYS (Function not implemented)

Note that all the syscalls after the statx returns -ENOSYS also return -ENOSYS -- even on openat() of a file that was previously opened successfully. I'm not quite sure what to make of this.

Comment 16 Jeff Layton 2019-10-16 23:40:22 UTC
In any case, I think this is starting to look like a rhel7 kernel bug involving containers. I think we ought to open a separate bug for that problem, and maybe try to hand-roll a reproducer program that just calls statx and a few syscalls afterward.

That said, we may want to hold off on pushing the new coreutils to stable repos until we understand this problem better.
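
A hand-rolled reproducer along the lines suggested above might look like the following sketch. It uses Python with ctypes rather than C, calls statx through the generic syscall(2) entry point so that no libc-level fallback gets in the way, and then issues a couple of ordinary calls to see whether they still behave. The syscall-number table and the names `raw_statx` and `probe` are assumptions made for illustration, and the sketch has not been run against the affected RHEL7 hosts.

```python
import ctypes
import errno
import os
import platform
import sys

# statx syscall numbers per machine type (assumption: values from the Linux
# syscall tables; extend this for other architectures).
SYS_STATX = {"x86_64": 332, "aarch64": 291, "ppc64le": 383}
AT_FDCWD = -100
STATX_BASIC_STATS = 0x7FF


def raw_statx(path):
    """Call statx via syscall(2). Returns (return value, errno name or None)."""
    nr = SYS_STATX.get(platform.machine())
    if not sys.platform.startswith("linux") or nr is None:
        return None, "SKIPPED"
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    buf = ctypes.create_string_buffer(256)  # sizeof(struct statx)
    ctypes.set_errno(0)
    ret = libc.syscall(
        ctypes.c_long(nr),
        ctypes.c_long(AT_FDCWD),
        ctypes.c_char_p(os.fsencode(path)),
        ctypes.c_long(0),  # flags
        ctypes.c_long(STATX_BASIC_STATS),
        buf,
    )
    err = ctypes.get_errno()
    return ret, (errno.errorcode.get(err, str(err)) if ret != 0 else None)


def probe(path="/"):
    """Run statx, then a few ordinary calls, recording each outcome."""
    results = [("statx",) + raw_statx(path)]
    follow_ups = (
        ("stat", lambda: os.stat(path)),
        ("open", lambda: os.close(os.open(path, os.O_RDONLY))),
    )
    for name, call in follow_ups:
        try:
            call()
            results.append((name, 0, None))
        except OSError as exc:
            results.append((name, -1, errno.errorcode.get(exc.errno)))
    return results


if __name__ == "__main__":
    for name, ret, err in probe():
        print(f"{name:6} ret={ret} errno={err}")
```

Running it once inside the container with the default confinement and once with `--privileged` would allow the two sets of results to be compared directly.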

Comment 17 Jeff Layton 2019-10-17 00:06:13 UTC
Opened this bug to track the kernel issue: https://bugzilla.redhat.com/show_bug.cgi?id=1762578

Comment 18 Kamil Dudka 2019-10-17 07:13:40 UTC
(In reply to Jeff Layton from comment #16)
> That said, we may want to hold off on pushing the new coreutils to stable
> repos until we understand this problem better.

If you mean Fedora stable repos, the update is already there.  I will submit an update to revert the change for now.

Comment 19 Kamil Dudka 2019-10-17 07:35:33 UTC
reverted in f30+ via https://src.fedoraproject.org/rpms/coreutils/c/664c64de

Comment 20 Jeff Layton 2019-10-17 11:19:59 UTC
Thanks Kamil. Hopefully we can just repush it later once we get this containerization problem sorted out.

Comment 21 Jonathan Lebon 2019-10-17 19:27:54 UTC
Just to add some details here. We've hit this too upstream in coreos-assembler, where a bunch of our CI runs in OpenShift clusters running on top of el7. One can reproduce this by simply running a docker container on RHEL (using RHEL AH here):

```
[root@r7-ah ~]# docker run --rm -ti registry.fedoraproject.org/fedora:30
[root@540ad57daeb5 /]# yum install -y https://kojipkgs.fedoraproject.org//packages/coreutils/8.31/5.fc30/x86_64/coreutils{,-common}-8.31-5.fc30.x86_64.rpm
[root@540ad57daeb5 /]# ls
ls: cannot access 'mnt': Operation not permitted
ls: cannot access 'proc': Operation not permitted
ls: cannot access 'media': Operation not permitted
ls: cannot access 'var': Operation not permitted
ls: cannot access 'run': Operation not permitted
ls: cannot access 'lib64': Operation not permitted
ls: cannot access 'opt': Operation not permitted
ls: cannot access 'tmp': Operation not permitted
ls: cannot access 'srv': Operation not permitted
ls: cannot access 'root': Operation not permitted
ls: cannot access 'lib': Operation not permitted
ls: cannot access 'boot': Operation not permitted
ls: cannot access 'sbin': Operation not permitted
ls: cannot access 'etc': Operation not permitted
ls: cannot access 'lost+found': Operation not permitted
ls: cannot access 'bin': Operation not permitted
ls: cannot access 'dev': Operation not permitted
ls: cannot access 'sys': Operation not permitted
ls: cannot access 'home': Operation not permitted
ls: cannot access 'usr': Operation not permitted
bin  boot  dev  etc  home  lib  lib64  lost+found  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
[root@540ad57daeb5 /]# exit
```

It's likely the default seccomp docker profile blocking statx, since running with seccomp=unconfined works fine:

```
[root@r7-ah ~]# docker run --name no-seccomp -ti --security-opt seccomp=unconfined registry.fedoraproject.org/fedora:30
[root@d81163481ce9 /]# yum install -y https://kojipkgs.fedoraproject.org//packages/coreutils/8.31/5.fc30/x86_64/coreutils{,-common}-8.31-5.fc30.x86_64.rpm
[root@d81163481ce9 /]# ls
bin  boot  dev  etc  home  lib  lib64  lost+found  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
```

Haven't strace'd it, but likely it's hitting ENOSYS and falling back correctly.
Should the kernel be checking ENOSYS *before* checking the seccomp profile?
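
To illustrate the hypothesis: a seccomp filter runs at syscall entry, and an allowlist-style profile answers every syscall it does not list with a fixed default error, regardless of whether the running kernel implements that syscall. The toy model below is plain Python, not libseccomp's API. `make_filter`, the syscall name sets and the EPERM default are assumptions, chosen to match the "Operation not permitted" messages above.

```python
import errno


def make_filter(allowed, default_errno=errno.EPERM):
    """Return a function mapping a syscall name to the filter's verdict."""
    def verdict(syscall_name):
        if syscall_name in allowed:
            return "allow", None
        # Anything not listed gets the profile's default error, whether
        # or not the running kernel implements that syscall.
        return "errno", default_errno
    return verdict


# Hypothetical profile built without any knowledge of statx.
old_profile = make_filter({"stat", "lstat", "fstat", "openat", "write"})
# Hypothetical profile that lists statx explicitly.
new_profile = make_filter({"stat", "lstat", "fstat", "openat", "write", "statx"})

if __name__ == "__main__":
    for name, profile in (("old", old_profile), ("new", new_profile)):
        print(name, profile("statx"))
```

In this model the old profile never lets the call reach the point where the kernel could answer ENOSYS, so a userspace fallback keyed on ENOSYS would not be triggered.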

Comment 23 Wolfgang Ulbrich 2019-10-20 15:48:00 UTC
We (MATE Desktop) are using the Fedora 30 Docker image for Travis CI on GitHub, and all builds are failing now.

example build: https://travis-ci.org/mate-desktop/engrampa/jobs/596986799#L578

```
ls: cannot access '.': Operation not permitted

configure: error: working directory cannot be determined

!!! ERROR: run command [docker exec -t engrampa-fedora-build /rootdir/before_scripts].

The command "./docker-build --name ${DISTRO} --verbose --config .travis.yml --build scripts" exited with 1.
```

It seems this is caused by coreutils-8.31-5.fc30.
Can we expect that with coreutils-8.31-6.fc30 all our builds will run fine again on Travis CI?

Comment 24 Kamil Dudka 2019-10-21 15:00:18 UTC
(In reply to Wolfgang Ulbrich from comment #23)
> It seems this is caused by coreutils-8.31-5.fc30.
> Can we expect that with coreutils-8.31-6.fc30 all our builds will run fine
> again on Travis CI?

I hope so as coreutils-8.31-6.fc30 fully reverts the change released with coreutils-8.31-5.fc30.

Comment 25 Joel Bartley 2019-10-28 20:35:35 UTC
I can confirm that in my environment the issue is resolved with coreutils-8.31-6.fc30.  Thank you.

Comment 26 Jeff Layton 2019-11-18 11:42:09 UTC
The problem with running this in a container under RHEL7 kernels has been tracked down to a bug in libseccomp. There is a proposed fix, and the security engineering folks are working on rolling it out for RHEL7. Now that we understand the problem, I think we ought to go ahead and roll this change into F31.

People running F31 containers on RHEL7 hosts will need to disable seccomp confinement or download a patched libseccomp package until the fixes are in the main RHEL7/CentOS 7 repos.

Comment 27 Kamil Dudka 2019-11-18 12:15:38 UTC
Thank you for taking care of tracking down the root cause, Jeff!

Given the nature of the breakage, I would prefer to wait till statx() works out of the box (or at least fails properly without unwanted side-effects).  For regular Fedora users it is very difficult to figure out what broke their systems in this case.  coreutils is too fundamental a package in Fedora to be broken by default in a commonly used environment.

I will create unofficial copr repositories for stable Fedora releases so that users interested in the feature can test it easily.

Comment 28 Kamil Dudka 2019-11-18 14:52:15 UTC
(In reply to Kamil Dudka from comment #27)
> I will create unofficial copr repositories for stable Fedora releases so
> that users interested in the feature can test it easily.

https://copr.fedorainfracloud.org/coprs/kdudka/coreutils-statx/

Comment 29 Kamil Dudka 2020-03-06 11:50:18 UTC
*** Bug 1806567 has been marked as a duplicate of this bug. ***

Comment 30 Kamil Dudka 2020-03-13 12:06:08 UTC
The code is now included again in coreutils-8.32-3.fc33 and coreutils-8.32-3.fc32.

Comment 31 Jan Pazdziora 2020-03-19 06:33:08 UTC
I see rawhide-based docker tests failing in Travis CI across my projects. Example job showing the issue is https://travis-ci.org/github/adelton/freeipa-container/builds/664262520. I understand that this is an incompatibility issue on the host, just pointing out that the change will cause Fedora and RHEL 8 to stop working in default Travis CI settings.

Comment 32 Jan Pazdziora 2020-03-19 06:37:32 UTC
For the record, adding

before_install:
- sudo apt-get update && sudo apt-get install -y libseccomp2

to .travis.yml seems to help:

https://travis-ci.org/github/adelton/freeipa-container/builds/664263863

Comment 33 Jan Pazdziora 2020-03-19 06:53:35 UTC
Using bionic instead of the default xenial is another possible fix in Travis CI:
https://travis-ci.org/github/adelton/freeipa-container/builds/664265296

Comment 34 Kamil Dudka 2020-03-19 08:57:53 UTC
Sorry for the breakage!  This time the change came from the GNU coreutils upstream release, so it is at least not specific to Fedora.  I hope that the run-time environments will be fixed before this change reaches stable releases of Fedora.

coreutils-8.32-3.fc32 is currently in f32-updates-testing.  Should I revoke the update?

Comment 35 Jan Pazdziora 2020-03-19 09:13:38 UTC
I don't dare to speculate how many projects test in Fedora containers on Travis CI, or what other CI environments (in the cloud and internal) offer on the hosts. It hit nearly all of my projects, but I fixed them all with

+ dist: bionic

surprisingly with no ill effects so far.

If we care about Travis CI interoperability, it might be worth getting an ETA from them for when the default Linux environment (https://docs.travis-ci.com/user/reference/overview/#linux) is expected to switch to Ubuntu Bionic.

Comment 36 Kamil Dudka 2020-03-19 11:48:58 UTC
Thanks for the feedback!  It sounds like I should revert the change for Fedora 32.  I will keep it in rawhide because systems running Fedora rawhide userspace should use a reasonably fresh run-time environment for other reasons anyway.  I am a user of Travis CI myself and I must admit that I always keep the "dist:" line in .travis.yml unchanged as long as my CI is green.  So delaying the transition would not help much in my case.

Comment 37 Kamil Dudka 2020-03-19 14:57:34 UTC
I have reverted the change in Fedora 32 only:

https://src.fedoraproject.org/rpms/coreutils/c/f5136f39

Comment 38 Jeff Layton 2020-04-28 16:18:19 UTC
Did we ever learn when Travis CI was going to change their default distro? At some point I think we're just going to have to rip off the bandaid...

Comment 39 Kamil Dudka 2020-04-28 21:32:18 UTC
I did not find the info myself.  But this is not specific to Travis CI.  Other environments with different life cycles might be affected as well.

Comment 40 Jeff Layton 2020-04-29 00:04:21 UTC
Absolutely. I'm just wondering at what point we say "you need to update the host OS" (libseccomp, in particular)?

If we have to punt this out to another release because major consumers are not ready, then so be it, but we've done this for a couple of releases now, and it's not clear to me what criteria we should use to judge when this is ready.

Comment 41 Kamil Dudka 2020-04-29 07:32:11 UTC
From my perspective, the important step was the upstream release of coreutils that uses statx() by default if available.  Downstream consumers are picking it up now and so far nobody has complained about it on the upstream mailing list.  The natural way to stabilize it is to wait till it goes from Fedora rawhide to Fedora 33, from Debian Testing to Debian Stable, etc.  Nevertheless, if anyone thinks it should be reintroduced in Fedora 32, which was released yesterday, just let me know when you think we should give it another try.

Comment 42 Ondrej 2020-04-29 07:36:42 UTC
I am only wondering if there is still any chance this will be introduced in RHEL-8.

Comment 43 Kamil Dudka 2020-04-29 08:03:55 UTC
(In reply to Ondrej from comment #42)
> I am only wondering if there is still any chance this will be introduced in RHEL-8.

I hope so but we need to stabilize it in Fedora first.

Comment 58 errata-xmlrpc 2021-11-09 19:42:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (coreutils bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4418