Bug 1938459 - suboptimal library versioning of libacl and libattr
Summary: suboptimal library versioning of libacl and libattr
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: acl
Version: 34
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1937956
TreeView+ depends on / blocked
 
Reported: 2021-03-13 14:34 UTC by Kamil Dudka
Modified: 2021-04-01 00:51 UTC (History)
17 users (show)

Fixed In Version: acl-2.3.1-1.fc34
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1937956
Environment:
Last Closed: 2021-04-01 00:51:35 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Kamil Dudka 2021-03-13 14:34:29 UTC
+++ This bug was initially created as a clone of Bug #1937956 +++

--- Additional comment from Lars S. Jensen on 2021-03-13 13:27:04 CET ---

The upgrade of fc34 trigger a lot from missing lilbacl.1.so errors by postinstall scripts for grub2/kernel
Upgraded: libacl-2.3.0-1.fc34.x86_64
There was no kernel files for version -5.11.6-300.fc34.x86_64 in /boot and grub is TBD:

 cp: error while loading shared libraries: libacl.so.1: cannot open shared object file: No such file or directory
warning: %posttrans(grub2-common-1:2.04-39.fc34.noarch) scriptlet failed, exit status 127
sed: error while loading shared libraries: libacl.so.1: cannot open shared object file: No such file or directory
...
head: write error: Broken pipe
sed: error while loading shared libraries: libacl.so.1: cannot open shared object file: No such file or directory
head: write error: Broken pipe
cp: error while loading shared libraries: libacl.so.1: cannot open shared object file: No such file or directory
dracut: dracut: creation of /boot/initramfs-5.11.6-300.fc34.x86_64.img failed
warning: %posttrans(kernel-core-5.11.6-300.fc34.x86_64) scriptlet failed, exit status 1

Workaround was to reinstall grub2/kernel before reboot:
dnf reinstall kernel-*-*5.11.6-300.fc34  kernel-*5.11.6-300.fc34  grub2-*

--- Additional comment from Kamil Dudka on 2021-03-13 14:12:07 CET ---

Honestly, I have no idea what happened on your system.  I am not able to reproduce it but will revoke the f34 update to stay on the safe side.

Could you please check what happened with libacl.so.1 ?

Please paste the output of the following commands:

# ldd /usr/bin/cp

# ls -l /usr/lib64/libacl.so*

# stat /usr/lib64/libacl.so.1 $(realpath /usr/lib64/libacl.so.1)

# rpm -V libacl

--- Additional comment from Lars S. Jensen on 2021-03-13 14:41:50 CET ---

This PC have been system-upgrade from FC33 10 day ago and the last upgrade before was 6 days ago.

ldd /usr/bin/cp
	linux-vdso.so.1 (0x00007ffc6c7c7000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f87fbac1000)
	libacl.so.1 => /lib64/libacl.so.1 (0x00007f87fbab6000)
	libattr.so.1 => /lib64/libattr.so.1 (0x00007f87fbaae000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f87fb8df000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f87fb848000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f87fb841000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f87fbb29000)

  ls -l /usr/lib64/libacl.so*
lrwxrwxrwx. 1 root root    17 Mar 13 11:45 /usr/lib64/libacl.so.1 -> libacl.so.1.1.230
-rwxr-xr-x. 1 root root 41208 Mar 12 09:32 /usr/lib64/libacl.so.1.1.230
stat /usr/lib64/libacl.so.1 $(realpath /usr/lib64/libacl.so.1)
  File: /usr/lib64/libacl.so.1 -> libacl.so.1.1.230
  Size: 17        	Blocks: 8          IO Block: 4096   symbolic link
Device: 23h/35d	Inode: 1660415     Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:lib_t:s0
Access: 2021-03-13 11:45:26.083672539 +0100
Modify: 2021-03-13 11:45:26.083672539 +0100
Change: 2021-03-13 11:45:26.083672539 +0100
 Birth: 2021-03-13 11:45:26.083672539 +0100
  File: /usr/lib64/libacl.so.1.1.230
  Size: 41208     	Blocks: 88         IO Block: 4096   regular file
Device: 23h/35d	Inode: 1618674     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:lib_t:s0
Access: 2021-03-12 09:32:45.000000000 +0100
Modify: 2021-03-12 09:32:45.000000000 +0100
Change: 2021-03-13 11:42:52.932981512 +0100
 Birth: 2021-03-13 11:42:52.931981522 +0100

rpm -V libacl
#no issue:
rpm -V libacl --verbose 
.........  a /usr/lib/.build-id
.........  a /usr/lib/.build-id/70
.........  a /usr/lib/.build-id/70/94938cb34f623676283e98762b40867adba416
.........    /usr/lib64/libacl.so.1
.........    /usr/lib64/libacl.so.1.1.230


The upgrade was done with:
dnf --verbose --allowerasing  --obsoletes distro-sync  --refresh --skip-broken.

Can it be the creation of the link to libacl.so.1.1.230 that is delayed

2021-03-13T11:42:52+0100 SUBDEBUG Upgrade: libibverbs-34.0-2.fc34.x86_64
2021-03-13T11:42:52+0100 SUBDEBUG Upgrade: libattr-2.5.0-1.fc34.x86_64
2021-03-13T11:42:52+0100 SUBDEBUG Upgrade: libacl-2.3.0-1.fc34.x86_64
2021-03-13T11:42:52+0100 SUBDEBUG Upgrade: acl-2.3.0-1.fc34.x86_64
2021-03-13T11:42:52+0100 SUBDEBUG Upgrade: librdmacm-34.0-2.fc34.x86_64
2021-03-13T11:42:53+0100 SUBDEBUG Upgrade: librados2-2:16.1.0-0.5.snapshot.fc34.x86_64

2021-03-13T11:44:51+0100 SUBDEBUG Upgraded: mariadb-common-3:10.5.8-8.fc34.x86_64
2021-03-13T11:44:51+0100 SUBDEBUG Upgraded: libacl-2.2.53-10.fc34.x86_64
2021-03-13T11:44:51+0100 SUBDEBUG Upgraded: perl-NDBM_File-1.15-472.fc34.x86_64
2021-03-13T11:44:51+0100 SUBDEBUG Upgraded: perl-libs-4:5.32.1-472.fc34.x86_64


2021-03-13T11:44:54+0100 SUBDEBUG Upgraded: btrfs-progs-5.10-2.fc34.x86_64
2021-03-13T11:44:54+0100 INFO cp: error while loading shared libraries: libacl.so.1: cannot open shared object file: No such file or directory
warning: %posttrans(grub2-common-1:2.04-39.fc34.noarch) scriptlet failed, exit status 127

2021-03-13T11:44:54+0100 ERROR Error in POSTTRANS scriptlet in rpm package grub2-common
2021-03-13T11:44:55+0100 INFO systemctl: error while loading shared libraries: libacl.so.1: cannot open shared object file: No such file or directory



I have 2 other PC with FC34 i can give it a try and one FC33 that i would like to give system-upgrade.

--- Additional comment from Lars S. Jensen on 2021-03-13 14:52:18 CET ---


Can it be the creation of the link to libacl.so.1.1.230 that is delayed?

This link is create after the error:

lrwxrwxrwx. 1 root root    17 Mar 13 11:45 /usr/lib64/libacl.so.1 -> libacl.so.1.1.230
Access: 2021-03-13 11:45:26.083672539 +0100


2021-03-13T11:44:54+0100 INFO cp: error while loading shared libraries: libacl.so.1: cannot open shared object file: No such file or directory
warning: %posttrans(grub2-common-1:2.04-39.fc34.noarch) scriptlet failed, exit status 127

--- Additional comment from Lars S. Jensen on 2021-03-13 15:16:45 CET ---

This link is create 1s after the last error in dnf.rpm.log this is also the last entry:
dracut: dracut: creation of /boot/initramfs-5.11.6-300.fc34.x86_64.img failed
warning: %posttrans(kernel-core-5.11.6-300.fc34.x86_64) scriptlet failed, exit status 1

2021-03-13T11:45:25+0100 ERROR Error in POSTTRANS scriptlet in rpm package kernel-core

but still in dnf transaction dnf.log:
2021-03-13T11:45:37+0100 DDEBUG RPM transaction over.
2021-03-13T11:45:43+0100 DDEBUG timer: verify transaction: 5492 ms

--- Additional comment from Kamil Dudka on 2021-03-13 15:30:49 CET ---

The resulting state seems correct to me.  Both libacl.so.1 and libacl.so.1.1.230 are installed by a single RPM package.  I can see with strace that the files are installed in a suboptimal order (the symlink first) but unless something else tries to load libacl.so.1 in the critical window during installation of a single RPM package, everything should just work.  I know that `rpm` itself links against libacl but if the updates runs sequentially, this should also be fine.

In any case, it is unclear what libacl could do better.  I will clone this bug for dnf.

Comment 1 Lars S. Jensen 2021-03-13 14:48:08 UTC
I don't thinks this is a libacl bug but in dnf because more things is missing: 
It look like the links and libs also llibattr is missing: dracut has log snapshot of issue in systemtd:
Mar 13 11:45:23 ask.lasj.home dracut[9411]: -rwxr-xr-x   1 root     root       300160 Jan 26 05:37 usr/lib64/ld-2.33.so
Mar 13 11:45:23 ask.lasj.home dracut[9414]: lrwxrwxrwx   1 root     root           10 Jan 26 05:37 usr/lib64/ld-linux-x86-64.so.2 -> ld-2.33.so
Mar 13 11:45:23 ask.lasj.home dracut[9417]: lrwxrwxrwx   1 root     root           29 Jan 26 05:37 usr/lib64/libaudit.so.1 -> ../../lib64/libaudit.so.1.0.0
Mar 13 11:45:23 ask.lasj.home dracut[9420]: -rwxr-xr-x   1 root     root       131552 Jan 26 05:37 usr/lib64/libaudit.so.1.0.0
Mar 13 11:45:23 ask.lasj.home dracut[9423]: lrwxrwxrwx   1 root     root           29 Jan 26 05:37 usr/lib64/libblkid.so.1 -> ../../lib64/libblkid.so.1.1.0
Mar 13 11:45:23 ask.lasj.home dracut[9426]: -rwxr-xr-x   1 root     root       217320 Jan 26 05:37 usr/lib64/libblkid.so.1.1.0
Mar 13 11:45:23 ask.lasj.home dracut[9429]: lrwxrwxrwx   1 root     root           36 Jan 26 05:37 usr/lib64/libbrotlicommon.so.1 -> ../../lib64/libbrotli>
Mar 13 11:45:23 ask.lasj.home dracut[9432]: -rwxr-xr-x   1 root     root       138744 Jan 26 03:02 usr/lib64/libbrotlicommon.so.1.0.9
Mar 13 11:45:23 ask.lasj.home dracut[9435]: lrwxrwxrwx   1 root     root           33 Jan 26 05:37 usr/lib64/libbrotlidec.so.1 -> ../../lib64/libbrotlidec>
Mar 13 11:45:23 ask.lasj.home dracut[9438]: -rwxr-xr-x   1 root     root        53080 Jan 26 03:02 usr/lib64/libbrotlidec.so.1.0.9
Mar 13 11:45:23 ask.lasj.home dracut[9441]: lrwxrwxrwx   1 root     root           27 Jan 26 05:37 usr/lib64/libbz2.so.1 -> ../../lib64/libbz2.so.1.0.8
Mar 13 11:45:23 ask.lasj.home dracut[9444]: -rwxr-xr-x   1 root     root        76824 Jan 26 05:37 usr/lib64/libbz2.so.1.0.8
Mar 13 11:45:23 ask.lasj.home dracut[9447]: -rwxr-xr-x   1 root     root      3228320 Jan 26 05:37 usr/lib64/libc-2.33.so


Here work the kernel reinstall:
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root       300160 Jan 26 05:37 usr/lib64/ld-2.33.so
Mar 13 12:57:24 ask.lasj.home dracut[53527]: lrwxrwxrwx   1 root     root           10 Jan 26 05:37 usr/lib64/ld-linux-x86-64.so.2 -> ld-2.33.so
Mar 13 12:57:24 ask.lasj.home dracut[53527]: lrwxrwxrwx   1 root     root           29 Jan 26 05:37 usr/lib64/libacl.so.1 -> ../../lib64/libacl.so.1.1.230
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root        41208 Jan 26 05:37 usr/lib64/libacl.so.1.1.230
Mar 13 12:57:24 ask.lasj.home dracut[53527]: lrwxrwxrwx   1 root     root           30 Jan 26 05:37 usr/lib64/libattr.so.1 -> ../../lib64/libattr.so.1.1.2>
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root        28656 Jan 26 05:37 usr/lib64/libattr.so.1.1.250
Mar 13 12:57:24 ask.lasj.home dracut[53527]: lrwxrwxrwx   1 root     root           29 Jan 26 05:37 usr/lib64/libaudit.so.1 -> ../../lib64/libaudit.so.1.0>
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root       131552 Jan 26 05:37 usr/lib64/libaudit.so.1.0.0
Mar 13 12:57:24 ask.lasj.home dracut[53527]: lrwxrwxrwx   1 root     root           29 Jan 26 05:37 usr/lib64/libblkid.so.1 -> ../../lib64/libblkid.so.1.1>
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root       217320 Jan 26 05:37 usr/lib64/libblkid.so.1.1.0
Mar 13 12:57:24 ask.lasj.home dracut[53527]: lrwxrwxrwx   1 root     root           36 Jan 26 05:37 usr/lib64/libbrotlicommon.so.1 -> ../../lib64/libbrotl>
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root       138744 Jan 26 03:02 usr/lib64/libbrotlicommon.so.1.0.9
Mar 13 12:57:24 ask.lasj.home dracut[53527]: lrwxrwxrwx   1 root     root           33 Jan 26 05:37 usr/lib64/libbrotlidec.so.1 -> ../../lib64/libbrotlide>
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root        53080 Jan 26 03:02 usr/lib64/libbrotlidec.so.1.0.9
Mar 13 12:57:24 ask.lasj.home dracut[53527]: lrwxrwxrwx   1 root     root           27 Jan 26 05:37 usr/lib64/libbz2.so.1 -> ../../lib64/libbz2.so.1.0.8
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root        76824 Jan 26 05:37 usr/lib64/libbz2.so.1.0.8
Mar 13 12:57:24 ask.lasj.home dracut[53527]: -rwxr-xr-x   1 root     root      3228320 Jan 26 05:37 usr/lib64/libc-2.33.so

Comment 2 Lars S. Jensen 2021-03-13 15:50:31 UTC
libattr have the same issue:
this link is update after the rpm writing was done: 
2021-03-13T11:42:52+0100 SUBDEBUG Upgrade: libattr-2.5.0-1.fc34.x86_64

This was the last entry in 
2021-03-13T11:45:25+0100 ERROR Error in POSTTRANS scriptlet in rpm package kernel-core

 ls -l /usr/lib64/libattr.so*
lrwxrwxrwx. 1 root root    18 Mar 13 11:45 /usr/lib64/libattr.so.1 -> libattr.so.1.1.250
-rwxr-xr-x. 1 root root 28656 Mar 12 09:29 /usr/lib64/libattr.so.1.1.250

stat /usr/lib64/libattr.so.1 $(realpath /usr/lib64/libattr.so.1)
  File: /usr/lib64/libattr.so.1 -> libattr.so.1.1.250
  Size: 18        	Blocks: 8          IO Block: 4096   symbolic link
Device: 23h/35d	Inode: 1660416     Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:lib_t:s0
Access: 2021-03-13 11:45:26.083672539 +0100
Modify: 2021-03-13 11:45:26.083672539 +0100
Change: 2021-03-13 11:45:26.083672539 +0100
 Birth: 2021-03-13 11:45:26.083672539 +0100  <-- 1s after dnf.log last entry and dracut run 

Is dnf/dracut using snapshot/owerlay/filesystem transaction that can trigger the problem?

Comment 3 Lars S. Jensen 2021-03-13 16:08:53 UTC
I have done the upgrade on PC that was last upgrade yesterday and that is OK.
So is a bad combination of rpm version.

I will do a system-upgrade on the FC33 machine to check if it is issue.

Comment 4 Lars S. Jensen 2021-03-13 17:51:14 UTC
System-upgrade from FC33 to FC34 work OK.

So is a bad combination of fc34 rpm versions that trigger the issue.

This bug can be close.

And in case someone hit the bug, the workaround was to reinstall grub2/kernel: 
dnf reinstall kernel-*-*5.11.6-300.fc34  kernel-*5.11.6-300.fc34  grub2-*

Comment 5 Kamil Dudka 2021-03-13 19:24:37 UTC
(In reply to Lars S. Jensen from comment #4)
> So is a bad combination of fc34 rpm versions that trigger the issue.

Could you please be more specific about the bad combination?

Any idea how to trigger the failure on demand?

Comment 6 Lars S. Jensen 2021-03-14 07:54:38 UTC
I have tried to reproduce the issue but did not work out:-(

I suspect it is related to a transaction/delay/error in the BTRFS filesystem in kernel 5.11.3 or dnf because the links/libs is there, but is first visible to the other processes (grub/dracut) too late, after dnf is finished with the rpm update.

Comment 7 Kamil Dudka 2021-03-14 14:32:55 UTC
Thank you for clarifying it!  As the failure seems like an intermittent issue, I will push the acl/attr update back to f34-testing.

Comment 8 Daniel Mach 2021-03-15 12:30:08 UTC
I'm not sure what the action would be on our end.
There's also not a clear reproducer.
Please reopen this bug if you have the reproducer and think the root cause is in dnf.

Comment 9 Panu Matilainen 2021-03-15 13:49:08 UTC
> Can it be the creation of the link to libacl.so.1.1.230 that is delayed?
>
> This link is create after the error:
>
> lrwxrwxrwx. 1 root root    17 Mar 13 11:45 /usr/lib64/libacl.so.1 -> libacl.so.1.1.230
> Access: 2021-03-13 11:45:26.083672539 +0100

The link is owned by the package, so it will have been created at the time of the install along with the other files of the package. However, something may have removed it in the meanwhile.

That something likely being ldconfig, which I would *expect* to be confused about the symlink versions here, because to consider libacl.so.1.1.2253 (from acl 2.2.53) *older* than libacl.so.1.1.230 (from 2.3.0) would be just strange. So, I believe ldconfig is getting run by some package's scriptlets, which ends up changing the so.1 symlink to the package from the old file, which will have been removed by the time %posttrans runs. And then it gets recreated when ldconfig runs as a transfiletrigger afterwards.

This is a bug in acl (and attr if it's the same way), library versions need to be sane for stuff to work.

Comment 10 Panu Matilainen 2021-03-15 13:53:13 UTC
Quoting https://www.gnu.org/software/libtool/manual/libtool.html#Updating-version-info:

> Never try to set the interface numbers so that they correspond to the release number of your package. This is an abuse that only fosters misunderstanding of the purpose of library versions.

Comment 11 Kamil Dudka 2021-03-15 14:10:25 UTC
Thank you for pointing it out!  That was my suspicion as well but I was not able to reproduce it myself.  The problem was that I was emulating the upgrade steps by rpm(1), which itself ran ldconfig after each step (unlike dnf).

I will ask upstream to improve the library versioning.

Comment 12 Kamil Dudka 2021-03-15 14:38:04 UTC
reported upstream: https://lists.nongnu.org/archive/html/acl-devel/2021-03/msg00002.html

Comment 13 Panu Matilainen 2021-03-15 14:41:38 UTC
Thanks!

FWIW, rpm/dnf makes no difference as such. For this to happen there needs to be some package (possibly entirely unrelated) calling ldconfig directly in the same transaction, in the time after the new acl package has been installed but before the old one is removed. Of course larger/complex transactions are easier to create with dnf.

Comment 14 Kamil Dudka 2021-03-15 14:50:15 UTC
Although it is not the cause of this bug report, I think there is also a room for improvement in rpm itself (or maybe rpm-build).  I discovered that rpm installs the symlink before the actual binary file, which is hardly useful.  If anything else tries to load libacl.so.1 in between these rename operations, it will unnecessarily fail as well:

# strace -e rename rpm -i ./libacl-2.3.0-1.fc34.x86_64.rpm --force
rename("/usr/lib/.build-id/70/94938cb34f623676283e98762b40867adba416;604f7324", "/usr/lib/.build-id/70/94938cb34f623676283e98762b40867adba416") = 0
rename("/usr/lib64/libacl.so.1;604f7324", "/usr/lib64/libacl.so.1") = 0
rename("/usr/lib64/libacl.so.1.1.230;604f7324", "/usr/lib64/libacl.so.1.1.230") = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6073, si_uid=0, si_status=0, si_utime=0, si_stime=1} ---
+++ exited with 0 +++

Comment 15 Fedora Update System 2021-03-16 08:04:35 UTC
FEDORA-2021-97d1d801e4 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-97d1d801e4

Comment 16 Fedora Update System 2021-03-16 14:44:13 UTC
FEDORA-2021-97d1d801e4 has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-97d1d801e4`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-97d1d801e4

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 17 Panu Matilainen 2021-03-19 11:58:38 UTC
(In reply to Kamil Dudka from comment #14)
> Although it is not the cause of this bug report, I think there is also a
> room for improvement in rpm itself (or maybe rpm-build).  I discovered that
> rpm installs the symlink before the actual binary file, which is hardly
> useful.  If anything else tries to load libacl.so.1 in between these rename
> operations, it will unnecessarily fail as well:

Right - rpm doesn't try to order symlinks in any special way, while in this case it seems like the right thing to do, that's not always so straightforward (think of symlinks to directories etc). When a transaction is in progress, the system is in an inconsistent state in any number of ways and weird things can happen, there's no way around that. Future rpm versions will in fact limit the time of per-package inconsistency considerably, but until we have transactional filesystems that's just about the best it can do.

Comment 18 Kamil Dudka 2021-03-21 13:57:21 UTC
I do not understand how symlinks to directories are different.  As long as the symlinks do not form a cycle, it is IMO always better to install the target before the symlink itself.

Comment 19 Fedora Update System 2021-04-01 00:51:35 UTC
FEDORA-2021-97d1d801e4 has been pushed to the Fedora 34 stable repository.
If problem still persists, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.