Bug 1652867
| Summary: | glibc: Avoid the need for manually running ldconfig after downgrade | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Vít Ondruch <vondruch> |
| Component: | glibc | Assignee: | Florian Weimer <fweimer> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rawhide | CC: | arjun.is, asosedki, codonell, dj, dominik, fweimer, igor.raits, jonemilj, law, mfabian, nicolas.mailhot, pfrankli, rth, siddhesh, sipoyare, udovdh |
| Target Milestone: | --- | Keywords: | FutureFeature, Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glibc-2.33.9000-15.fc35 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-07-05 10:41:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Vít Ondruch
2018-11-23 10:53:48 UTC
This is the PR which introduced these triggers: https://src.fedoraproject.org/rpms/glibc/pull-request/8

Interesting. It seems that the failure happens between glibc-2.28.9000-* and glibc-2.28-5.fc30, which was the latest stable 2.28 release. So it might be something different than Lua vs. shell. IOW this is fine:

glibc-2.28.9000-15.fc30.x86_64 => glibc-2.28.9000-1.fc30.x86_64

While this fails:

glibc-2.28.9000-15.fc30.x86_64 => glibc-2.28-5.fc30.x86_64
glibc-2.28.9000-1.fc30.x86_64 => glibc-2.28-5.fc30.x86_64

(In reply to Vít Ondruch from comment #0)
> Additional info:
> It seems that Lua used to be used for the triggers [1]. Not sure if there
> were some different issues triggering this change, but in the context of
> this issue, it was the better option and the commit should be reverted.

I don't think this will completely solve the issue. We need to figure out in which order RPM removes files and updates symbolic links and find a way to work around breakage that results from this.

Grepping for ld-linux|ld-2.28|execve.*/bin/sh during the downgrade shows this:

93186:25 openat(AT_FDCWD, "/lib64/ld-2.28.so;5bf7e290", O_WRONLY|O_CREAT|O_EXCL|O_TRUNC, 0666) = 44
93228:25 chown("/lib64/ld-2.28.so;5bf7e290", 0, 0) = 0
93229:25 chmod("/lib64/ld-2.28.so;5bf7e290", 0755) = 0
93231:25 utimensat(AT_FDCWD, "/lib64/ld-2.28.so;5bf7e290", [{tv_sec=1534319990, tv_nsec=0} /* 2018-08-15T09:59:50+0200 */, {tv_sec=1534319990, tv_nsec=0} /* 2018-08-15T09:59:50+0200 */], AT_SYMLINK_NOFOLLOW) = 0
93232:25 lstat("/lib64/ld-2.28.so", 0x7ffdb3b42440) = -1 ENOENT (No such file or directory)
93233:25 rename("/lib64/ld-2.28.so;5bf7e290", "/lib64/ld-2.28.so") = 0
93234:25 symlink("ld-2.28.so", "/lib64/ld-linux-x86-64.so.2;5bf7e290") = 0
93236:25 lchown("/lib64/ld-linux-x86-64.so.2;5bf7e290", 0, 0) = 0
93237:25 utimensat(AT_FDCWD, "/lib64/ld-linux-x86-64.so.2;5bf7e290", [{tv_sec=1534319883, tv_nsec=0} /* 2018-08-15T09:58:03+0200 */, {tv_sec=1534319883, tv_nsec=0} /* 2018-08-15T09:58:03+0200 */], AT_SYMLINK_NOFOLLOW) = 0
93242:25 lstat("/lib64/ld-linux-x86-64.so.2", {st_mode=S_IFLNK|0777, st_size=15, ...}) = 0
93243:25 rename("/lib64/ld-linux-x86-64.so.2;5bf7e290", "/lib64/ld-linux-x86-64.so.2") = 0
98327:25 symlink("../../../../lib64/ld-2.28.so", "/usr/lib/.build-id/d2/8755c775e07f0160005830f64d32bb93ea1ff0;5bf7e290") = 0
105825:28 lstat("/lib64/ld-2.28.9000.so", {st_mode=S_IFREG|0755, st_size=247536, ...}) = 0
105828:28 lstat("/lib64/ld-linux-x86-64.so.2", {st_mode=S_IFLNK|0777, st_size=10, ...}) = 0
105829:28 stat("/lib64/ld-linux-x86-64.so.2", {st_mode=S_IFREG|0755, st_size=228072, ...}) = 0
105830:28 openat(AT_FDCWD, "/lib64/ld-linux-x86-64.so.2", O_RDONLY) = 4
106309:28 lstat("/lib64/ld-2.28.so", {st_mode=S_IFREG|0755, st_size=228072, ...}) = 0
106669:28 stat("/lib64/ld-linux-x86-64.so.2", {st_mode=S_IFREG|0755, st_size=228072, ...}) = 0
106670:28 stat("/lib64/ld-2.28.9000.so", {st_mode=S_IFREG|0755, st_size=247536, ...}) = 0
106671:28 lstat("/lib64/ld-linux-x86-64.so.2", {st_mode=S_IFLNK|0777, st_size=10, ...}) = 0
106672:28 unlink("/lib64/ld-linux-x86-64.so.2") = 0
106673:28 symlink("ld-2.28.9000.so", "/lib64/ld-linux-x86-64.so.2") = 0
123359:30 execve("/bin/sh", ["/bin/sh", "/var/tmp/rpm-tmp.0fnUhD", "1"], 0x7ffdb3b44540 /* 21 vars */) = 0
133119:31 execve("/bin/sh", ["/bin/sh", "/var/tmp/rpm-tmp.XzSbz3", "1"], 0x7ffdb3b44540 /* 21 vars */) = 0
138675:25 lstat("/lib64/ld-linux-x86-64.so.2", {st_mode=S_IFLNK|0777, st_size=15, ...}) = 0
138680:25 lstat("/lib64/ld-2.28.9000.so", {st_mode=S_IFREG|0755, st_size=247536, ...}) = 0
138681:25 lstat("/lib64/ld-2.28.9000.so", {st_mode=S_IFREG|0755, st_size=247536, ...}) = 0
138682:25 lstat("/lib64/ld-2.28.9000.so", {st_mode=S_IFREG|0755, st_size=247536, ...}) = 0
138683:25 removexattr("/lib64/ld-2.28.9000.so", "security.capability") = -1 ENODATA (No data available)
138684:25 unlink("/lib64/ld-2.28.9000.so") = 0
146272:34 execve("/bin/sh", ["/bin/sh", "/var/tmp/rpm-tmp.7iz4cE", "0", "0"], 0x7ffdb3b44540 /* 21 vars */) = -1 ENOENT (No such file or directory)
146393:35 execve("/bin/sh", ["/bin/sh", "/var/tmp/rpm-tmp.aNyBSe", "0", "0"], 0x7ffdb3b44540 /* 21 vars */) = -1 ENOENT (No such file or directory)
146631:36 execve("/bin/sh", ["/bin/sh", "/var/tmp/rpm-tmp.0uETzP", "0", "0"], 0x7ffdb3b44540 /* 21 vars */ <unfinished ...>

What RPM is doing here isn't helpful at all. On line 106673, it restores the ld.so symbolic link to the old (pre-downgrade) version, and then proceeds to delete that ld.so version on line 138684.

(This is with rpm-4.14.2.1-3.fc30.x86_64 in the chroot.)

Solving this completely is not easy because RPM performs some file system updates very late in the transaction. Files deleted during the transaction stay around for a long time, and if ldconfig is executed by a scriptlet, it will break things during glibc downgrades.

This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle. Changing version to '31'.

The analysis in comment 3 is slightly off. It's ldconfig that restores the symbolic links because the old files (from the to-be-erased package) are still around when it runs.

In order to fix this, we would have to get rid of symbolic links and use paths like /lib64/libc.so.6 directly. This is something that requires a few upstream build system changes, but is probably generally useful.

*** Bug 1636593 has been marked as a duplicate of this bug. ***

Patches posted upstream: https://sourceware.org/ml/libc-alpha/2019-11/msg00971.html

Note that ppc64le and s390x have ld.so.1 as a name for the dynamic interpreter, and without symlinks this doesn't match any of the searches used by ldconfig to find ld.so as a shared library that you can link against. This will need some additional code upstream to handle this.

This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 31 reaches end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Still totally a thing, just lost a system to this bug going down from glibc-2.33-8.fc34 to glibc-2.32-4.fc33.

*** Bug 1965289 has been marked as a duplicate of this bug. ***

I.e., this also happens on Fedora 34 when going back from the Fedora 35 glibc to the Fedora 34 glibc. Is there a (manual) workaround to make the downgraded system work? (E.g., boot a Fedora Live DVD and fix whatever is needed.)
Just running ldconfig should fix things. You need to follow the usual steps for system recovery (get a writable root file system before, and trigger an SELinux relabel afterwards).

Thanks! Yes, `rpm -Uvh --force --nodeps glibc* libnsl-2.33-8.fc34.x86_64.rpm nscd-2.33-8.fc34.x86_64.rpm ; ldconfig -v` worked.

Changes are in dist-git, incorporating a proposed upstream patch before upstream review. There is a new warning during updates:

/usr/sbin/ldconfig: /lib64/ld-linux-x86-64.so.2 is not a symbolic link

I think it's harmless. I think it stems from processing a leftover ld-2.33.so file which has not been removed by RPM yet. Its soname is ld-linux-x86-64.so.2, so ldconfig schedules the creation of a symbolic link, but realizes later that this file already exists, and does nothing instead (except printing this warning). It's *so* close to not working, but this time, it looks like we are lucky and it actually works.

FEDORA-2021-9ce0f65a09 has been pushed to the Fedora 35 stable repository. If the problem still persists, please make note of it in this bug report.

This has caused ld.so to move from /usr/lib64 to /usr/lib on aarch64 and s390x and broken the lddtree.py test of pax-utils (https://koschei.fedoraproject.org/build/10505661). Was that intentional?

$ rpmdiff glibc-2.33.9000-14.fc35.aarch64.rpm glibc-2.33.9000-15.fc35.aarch64.rpm
...
SM5.......T /lib/ld-linux-aarch64.so.1
removed /lib64/ld-2.33.9000.so
removed /lib64/ld-linux-aarch64.so.1
$ rpm -q glibc
glibc-2.33.9000-15.fc35.aarch64
$ ls -l /lib64/ld*
ls: cannot access '/lib64/ld*': No such file or directory
$ ls -l /lib/ld*
-rwxr-xr-x. 1 root root 814848 Jun 15 14:50 /lib/ld-linux-aarch64.so.1

It seems counter-intuitive to have ld.so in /lib while the rest of the libraries are in /lib64 on a 64-bit arch.
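
For readers following the thread, the failure sequence from the strace analysis, and the reason a regular file at the loader path avoids it, can be replayed in a scratch directory. The following is a simplified Python sketch, not what RPM or ldconfig actually execute: the function name `loader_survives_downgrade` and the file contents are hypothetical, and ldconfig's link selection is reduced to the assumption that it points the link back at the old file while that file still exists.

```python
import os
import tempfile


def loader_survives_downgrade(use_symlink: bool) -> bool:
    """Replay a simplified glibc downgrade in a scratch directory.

    Returns True if the loader path still resolves to a file afterwards.
    """
    with tempfile.TemporaryDirectory() as lib:
        loader = os.path.join(lib, "ld-linux-x86-64.so.2")
        old = os.path.join(lib, "ld-2.28.9000.so")  # from the erased package
        new = os.path.join(lib, "ld-2.28.so")       # from the installed package

        def write(path, text):
            with open(path, "w") as f:
                f.write(text)

        if use_symlink:
            # Starting state: versioned file plus a symlink pointing at it.
            write(old, "old loader")
            os.symlink(os.path.basename(old), loader)

            # RPM installs the new file, then renames a fresh symlink into place.
            write(new, "new loader")
            tmp_link = loader + ";tmp"
            os.symlink(os.path.basename(new), tmp_link)
            os.replace(tmp_link, loader)

            # ldconfig runs from a scriptlet while the old file is still present
            # and (assumption: simplified) points the link back at the old file.
            os.unlink(loader)
            os.symlink(os.path.basename(old), loader)

            # Late in the transaction, RPM removes the old file.
            os.unlink(old)
        else:
            # Starting state: the loader is a regular file at its final path.
            write(loader, "old loader")

            # RPM writes a temporary file and renames it over the loader.
            tmp_file = loader + ";tmp"
            write(tmp_file, "new loader")
            os.replace(tmp_file, loader)

            # ldconfig finds a regular file, not a symlink, and leaves it alone.

        # os.path.exists follows symlinks, so a dangling link yields False.
        return os.path.exists(loader)
```

With the symlink layout the link ends up dangling, which corresponds to the `execve("/bin/sh", ...) = -1 ENOENT` entries in the trace; with the regular-file layout there is no link left for ldconfig to re-point.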
(In reply to Dominik 'Rathann' Mierzejewski from comment #20)
> This has caused ld.so to move from /usr/lib64 to /usr/lib on aarch64 and
> s390x and broken lddtree.py test of pax-utils
> (https://koschei.fedoraproject.org/build/10505661). Was that intentional?

Yes, it's required to work around the RPM issue.

> It seems counter-intuitive to have ld.so in /lib while the rest of the
> libraries are in /lib64 on a 64-bit arch.

The path to the dynamic loader is mandated by the psABI supplement. (It is hard-coded into main programs.) We cannot change it.

Test expectations will have to be adjusted accordingly.

(In reply to Florian Weimer from comment #21)
> (In reply to Dominik 'Rathann' Mierzejewski from comment #20)
[...]
> > It seems counter-intuitive to have ld.so in /lib while the rest of the
> > libraries are in /lib64 on a 64-bit arch.
>
> The path to the dynamic loader is mandated by the psABI supplement. (It is
> hard-coded into main programs.) We cannot change it.

Could you point to the specific document saying that ld.so must be in /lib on aarch64? I found the aarch64 ABI documentation (https://github.com/ARM-software/abi-aa), but I'm unable to find this specific requirement.

> Test expectations will have to be adjusted accordingly.

Obviously.

(In reply to Dominik 'Rathann' Mierzejewski from comment #22)
> Could you point to the specific document saying that ld.so must be in /lib
> on aarch64?
> I found the aarch64 ABI documentation
> (https://github.com/ARM-software/abi-aa), but
> I'm unable to find this specific requirement.

The PT_INTERP value must be consistent across distributions. Not all of them use /lib64 paths, which makes /lib a more natural choice in some ways.

I believe the System V psABI supplement for AArch64 has not yet been published. There are various ELF-related AArch64 specifications, but they equally apply to embedded scenarios. They do not specify Linux-specific aspects such as the ELF interpreter name or the minimum and maximum page size.
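
The PT_INTERP value mentioned above is stored in the program headers of each dynamically linked executable, which is why the loader path cannot simply be moved. A minimal Python sketch for reading it follows, assuming a 64-bit little-endian ELF image held in memory; `read_pt_interp` is a hypothetical helper written for illustration, not part of glibc, RPM, or pax-utils.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_pt_interp(data: bytes):
    """Return the PT_INTERP string of an ELF64 little-endian image, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian")

    # ELF64 header: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr starts with p_type (4), p_flags (4), p_offset (8).
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, base)
        if p_type == PT_INTERP:
            # p_filesz sits at offset 32 within the program header.
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None
```

On a real system, one could pass the contents of an executable such as /bin/sh, read in binary mode; the returned path depends on the architecture and is the name the kernel uses to find the dynamic loader.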
I'm only asking because you changed the location from /lib64 to /lib on aarch64 and s390x and Carlos only tested on i686 and x86_64 where you actually have both /lib and /lib64 due to multilib. This move was not explicitly mentioned, either. There may be other software out there that's expecting ld.so to be in /lib64 on 64-bit arches.

The official name of the program interpreter was always located under /lib, and that didn't change. Software which expects the program interpreter to exist in /lib64 is already non-portable. For example, Debian does not have a /lib64 directory at all: https://packages.debian.org/buster/arm64/libc6/filelist

I get the point. Thanks for the explanation. Shouldn't this bug be closed, by the way?

(In reply to Dominik 'Rathann' Mierzejewski from comment #26)
> I get the point. Thanks for the explanation. Shouldn't this bug be closed,
> by the way?

Indeed, closing.