Bug 2033020 - glibc: Reconsider "dns [!UNAVAIL=return] files" default for hosts database
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Carlos O'Donell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-15 17:49 UTC by Martin Pitt
Modified: 2023-09-29 01:34 UTC (History)

Fixed In Version: glibc-2.34.9000-33.fc36 glibc-2.34-17.fc35
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-02-08 14:21:56 UTC
Type: Bug
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Sourceware 28700 0 P2 NEW "dns [!UNAVAIL=return] files" default for hosts database is not useful 2021-12-15 18:03:54 UTC

Description Martin Pitt 2021-12-15 17:49:36 UTC
Description of problem: A few days ago, our rawhide tests on the Testing Farm (packit) started to fail on resolving "localhost" (Error: getaddrinfo ENOTFOUND localhost).

What happens is roughly that an instance is booted from the current Fedora rawhide cloud image, then it gets packages upgraded (to ensure tests run against the latest versions). This looks [1] like this:

16:38:38             Run command 'ssh -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oIdentitiesOnly=yes -i /etc/citool.d/id_rsa_artemis root.20.221 export PACKIT_UPSTREAM_NAME=starter-kit PACKIT_DOWNSTREAM_NAME=starter-kit PACKIT_DOWNSTREAM_URL=https://src.fedoraproject.org/rpms/starter-kit.git PACKIT_FULL_REPO_NAME=cockpit-project/starter-kit PACKIT_COMMIT_SHA=bad60b4bd5dae44f8014603ed4864fabca869f8b PACKIT_PACKAGE_NAME=starter-kit PACKIT_BUILD_LOG_URL=https://copr-be.cloud.fedoraproject.org/results/packit/cockpit-project-starter-kit-520/fedora-rawhide-x86_64/03050742-cockpit-starter-kit/builder-live.log.gz PACKIT_SRPM_URL=https://download.copr.fedorainfracloud.org/results/packit/cockpit-project-starter-kit-520/srpm-builds/03050742/cockpit-starter-kit-1-1.20211215113015495672.pr520.fc34.src.rpm PACKIT_PACKAGE_NVR=cockpit-starter-kit-1-1.20211215113015495672.pr520.fc36; rpm -q --whatprovides "cockpit-starter-kit" "make" "glibc-langpack-de" "libvirt-python3" "npm" "bzip2" "cockpit-system" "python3" "cockpit-ws" "git-core" || dnf install -y "cockpit-starter-kit" "make" "glibc-langpack-de" "libvirt-python3" "npm" "bzip2" "cockpit-system" "python3" "cockpit-ws" "git-core"'.
16:38:38             environment: None
[...]
16:38:42             out: ========================================================================================
16:38:42             out:  Package             Arch    Version                  Repository                    Size
16:38:42             out: ========================================================================================
16:38:42             out: Installing:
16:38:42             out:  bzip2               x86_64  1.0.8-10.fc36            testing-farm-tag-repository   52 k
16:38:42             out:  cockpit-ws          x86_64  259-1.fc36               testing-farm-tag-repository  1.3 M
16:38:42             out:  git-core            x86_64  2.33.1-2.fc36            testing-farm-tag-repository  3.8 M
16:38:42             out:  glibc-langpack-de   x86_64  2.34.9000-31.fc36        testing-farm-tag-repository  687 k
16:38:42             out:  make                x86_64  1:4.3-6.fc35             testing-farm-tag-repository  533 k
16:38:42             out:  npm                 x86_64  1:8.1.2-1.16.13.1.1.fc36 testing-farm-tag-repository  1.7 M
16:38:42             out:  python3-libvirt     x86_64  7.9.0-1.fc36             testing-farm-tag-repository  316 k
16:38:42             out: Upgrading:
16:38:42             out:  glibc               x86_64  2.34.9000-31.fc36        testing-farm-tag-repository  2.0 M
16:38:42             out:  glibc-common        x86_64  2.34.9000-31.fc36        testing-farm-tag-repository  418 k
16:38:42             out:  glibc-gconv-extra   x86_64  2.34.9000-31.fc36        testing-farm-tag-repository  1.6 M
16:38:42             out:  glibc-langpack-en   x86_64  2.34.9000-31.fc36        testing-farm-tag-repository  698 k
16:38:42             out: Installing dependencies:
16:38:42             out:  cyrus-sasl          x86_64  2.1.27-16.fc36           testing-farm-tag-repository   71 k
16:38:42             out:  cyrus-sasl-gssapi   x86_64  2.1.27-16.fc36           testing-farm-tag-repository   26 k
16:38:42             out:  gc                  x86_64  8.0.6-1.fc36             testing-farm-tag-repository  103 k
16:38:42             out:  guile22             x86_64  2.2.7-3.fc35             testing-farm-tag-repository  6.4 M
16:38:42             out:  libssh2             x86_64  1.10.0-2.fc36            testing-farm-tag-repository  117 k
16:38:42             out:  libtool-ltdl        x86_64  2.4.6-45.fc36            testing-farm-tag-repository   36 k
16:38:42             out:  libuv               x86_64  1:1.42.0-2.fc36          testing-farm-tag-repository  148 k
16:38:42             out:  libvirt-libs        x86_64  7.10.0-1.fc36            testing-farm-tag-repository  4.5 M
16:38:42             out:  libwsman1           x86_64  2.7.1-1.fc36             testing-farm-tag-repository  139 k
16:38:42             out:  nodejs              x86_64  1:16.13.1-1.fc36         testing-farm-tag-repository  199 k
16:38:42             out:  nodejs-libs         x86_64  1:16.13.1-1.fc36         testing-farm-tag-repository   14 M
16:38:42             out:  numactl-libs        x86_64  2.0.14-4.fc35            testing-farm-tag-repository   30 k
16:38:42             out:  yajl                x86_64  2.1.0-17.fc35            testing-farm-tag-repository   37 k
16:38:42             out: Installing weak dependencies:
16:38:42             out:  fedora-logos        noarch  35.0.0-2.fc36            testing-farm-tag-repository  1.3 M
16:38:42             out:  nodejs-docs         noarch  1:16.13.1-1.fc36         testing-farm-tag-repository  6.6 M
16:38:42             out:  nodejs-full-i18n    x86_64  1:16.13.1-1.fc36         testing-farm-tag-repository  7.8 M
16:38:42             out:  sscg                x86_64  3.0.1-1.fc36             testing-farm-tag-repository   45 k
[...]
16:39:16             out:   Cleanup          : glibc-2.34.9000-27.fc36.x86_64                       29/32 
16:39:16             out: warning: /etc/nsswitch.conf saved as /etc/nsswitch.conf.rpmsave


After that, there is no /etc/nsswitch.conf any more, and commands like "getent ahosts local" and "getent hosts localhost" fail.
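
The failure described above can be checked with a small script. This is a minimal sketch, not part of the original report: the function name check_localhost and its parameters are hypothetical, and it only reports whether /etc/nsswitch.conf exists and whether "localhost" resolves through the system resolver (socket.getaddrinfo, which goes through NSS on glibc systems).

```python
import os
import socket


def check_localhost(resolver=socket.getaddrinfo, conf_path="/etc/nsswitch.conf"):
    """Report whether nsswitch.conf exists and whether "localhost" resolves.

    `resolver` is injectable so the check can be exercised without touching
    the real system configuration (hypothetical helper, for illustration).
    """
    result = {
        "nsswitch_conf_present": os.path.exists(conf_path),
        "addresses": [],
        "error": None,
    }
    try:
        infos = resolver("localhost", None)
    except socket.gaierror as exc:
        # This is the Python-level equivalent of "getaddrinfo ENOTFOUND".
        result["error"] = str(exc)
        return result
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual address.
    result["addresses"] = sorted({info[4][0] for info in infos})
    return result
```

Calling check_localhost() on an affected instance would be expected to show nsswitch_conf_present as False and a non-empty error, while a healthy system should list at least one loopback address.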


This is almost certainly fallout from https://src.fedoraproject.org/rpms/glibc/c/cadee80b1316bc264db044180e20dc1e671ed1ea

These instances do *not* have authselect installed. If glibc now assumes that authselect takes over the ownership, it must grow a Requires: authselect, or it must not remove the file on upgrades.

Version-Release number of selected component (if applicable):

glibc 2.34.9000-31.fc36

This always happens on fedora-rawhide packit tests now, but I have only quickly tried to reproduce it with a simple "dnf update", and that did not remove the file. So the conditions under which this happens are a bit more subtle apparently. I'll try to find a reproducer tomorrow, but the cause already seems quite clear?

Additional info:

[1] http://artifacts.dev.testing-farm.io/f4046164-3cb6-4662-89b2-b8e47b39b1cb//work-allJOfFfz/log.txt

Comment 1 Martin Pitt 2021-12-15 17:50:50 UTC
See https://github.com/cockpit-project/starter-kit/pull/520 for my initial notes and investigations. Adding Miroslav to CC, as he was debugging that with me.

Comment 2 Florian Weimer 2021-12-15 18:01:16 UTC
dns is searched first and provides an answer for "localhost" (NXDOMAIN). As a result, we never look at the files database, so the contents of /etc/hosts are ignored. This needs a bit of upstream discussion.

How urgent is it to fix this? Thanks.
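
The lookup behaviour described in this comment can be illustrated with a toy model of how an nsswitch-style service list is walked. This is a sketch under stated assumptions, not glibc code: parse_hosts_line, lookup, and the two fake backends are hypothetical names, only the "return" and "continue" actions are modelled, and the DNS backend is hard-wired to answer NOTFOUND (NXDOMAIN) for every name.

```python
import re

STATUSES = ("SUCCESS", "NOTFOUND", "UNAVAIL", "TRYAGAIN")
# Default actions: stop on success, try the next service otherwise.
DEFAULT_ACTIONS = {
    "SUCCESS": "return",
    "NOTFOUND": "continue",
    "UNAVAIL": "continue",
    "TRYAGAIN": "continue",
}


def parse_hosts_line(spec):
    """Parse e.g. 'dns [!UNAVAIL=return] files' into [(service, actions), ...]."""
    entries = []
    for token in re.findall(r"\[[^\]]*\]|\S+", spec):
        if not token.startswith("["):
            entries.append((token, dict(DEFAULT_ACTIONS)))
            continue
        if not entries:
            raise ValueError("an action list must follow a service name")
        actions = entries[-1][1]
        for item in token[1:-1].split():
            status, action = item.split("=")
            negate = status.startswith("!")
            status = status.lstrip("!").upper()
            for s in STATUSES:
                # '!UNAVAIL=return' applies to every status except UNAVAIL.
                if (s == status) != negate:
                    actions[s] = action.lower()
    return entries


def lookup(spec, name, backends):
    """Walk the services in order; backends maps service -> f(name) -> (status, answer)."""
    status, answer = "UNAVAIL", None
    for service, actions in parse_hosts_line(spec):
        status, answer = backends[service](name)
        if actions[status] == "return":
            break
    return status, answer


def fake_dns(name):
    # Assumption for this sketch: the DNS server answers NXDOMAIN.
    return ("NOTFOUND", None)


def fake_files(name):
    hosts = {"localhost": "127.0.0.1"}  # stand-in for /etc/hosts
    return ("SUCCESS", hosts[name]) if name in hosts else ("NOTFOUND", None)


BACKENDS = {"dns": fake_dns, "files": fake_files}

print(lookup("dns [!UNAVAIL=return] files", "localhost", BACKENDS))
# -> ('NOTFOUND', None)
print(lookup("files dns", "localhost", BACKENDS))
# -> ('SUCCESS', '127.0.0.1')
```

In this model, the NOTFOUND answer from dns triggers the "return" action, so files is never consulted. The "files dns" ordering is shown only for contrast.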

Comment 3 Martin Pitt 2021-12-15 18:07:36 UTC
It seems to me that the dependencies are not set up correctly -- the new glibc does not require the new pam-1.5.2-8.fc36, thus authselect is never installed. So adding a "Requires: pam >= 1.5.2-8", or an inverse Conflicts: (does rpm have that?) should fix this?

Comment 4 Martin Pitt 2021-12-15 18:09:26 UTC
I wouldn't make this bug about the builtin defaults -- this is about a broken upgrade path for properly transitioning the ownership of nsswitch.conf. Just removing nsswitch would also break libnss_systemd, sssd, and other configs.

Comment 5 Florian Weimer 2021-12-15 18:14:52 UTC
(In reply to Martin Pitt from comment #4)
> I wouldn't make this bug about the builtin defaults -- this is about a
> broken upgrade path for properly transitioning the ownership of
> nsswitch.conf. Just removing nsswitch would also break libnss_systemd, sssd,
> and other configs.

I'm not sure if we can add a dependency on pam to glibc. This kind of dependency cycle probably has a lot of unwanted side effects. There must be a better way to pull in authselect. On the other hand, it could be one of those cases where it is just very hard to support partial rawhide upgrades.

(I definitely want to change the glibc default.)

Comment 6 Martin Pitt 2021-12-15 21:32:43 UTC
> I'm not sure if we can add a dependency on pam to glibc.

Right, hence my suggestion to have "Conflicts: pam < 1.5.2-8".

> I definitely want to change the glibc default.

Yes, fully agreed -- being able to resolve "localhost" without /etc/nsswitch.conf and /etc/hosts would certainly be nice. Right now one needs libnss-systemd's "myhostname" for that. But right now one has to pick between "get rid of /etc/hosts" (and enable myhostname in nsswitch) and "get rid of nsswitch" (and then having to keep /etc/hosts). So much legacy :'-(

I just thought that fixing *this* upgrade issue is a simple Conflicts: addition, and that changing the builtin defaults is much more discussion/work/risk.

Comment 7 Florian Weimer 2021-12-20 10:43:47 UTC
(In reply to Martin Pitt from comment #6)
> > I'm not sure if we can add a dependency on pam to glibc.
> 
> Right, hence my suggestion to have "Conflicts: pam < 1.5.2-8".

The loop is still there. RPM will not know whether to update pam first or glibc if the new pam requires the new glibc. For example, pam requires libc.so.6(GLIBC_2.34)(64bit), so if you are upgrading from pre-2.34 glibc, it's not installable until glibc is upgraded.

This has to be handled in a different way.
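
The ordering argument above can be sketched as a cycle in a "must be handled before" graph. This is only a toy model of the reasoning in this comment, not of RPM's actual transaction ordering: the function install_order and the package labels are hypothetical, and it assumes that a Conflicts: on the old pam forces the new pam to go in first, while the new pam's requirement on GLIBC_2.34 forces the new glibc to go in first. It needs Python 3.9 or later for graphlib.

```python
from graphlib import CycleError, TopologicalSorter


def install_order(must_precede):
    """Return a valid order, or None if the constraints form a loop.

    must_precede maps a package to the set of packages that have to be
    handled before it (toy model, hypothetical helper).
    """
    try:
        return list(TopologicalSorter(must_precede).static_order())
    except CycleError:
        return None


# Only "new pam requires new glibc": a consistent order exists.
print(install_order({"new pam": {"new glibc"}}))
# -> ['new glibc', 'new pam']

# Adding "new glibc must not coexist with old pam" (assumed to mean
# "new pam first") closes the loop, so no order satisfies both.
print(install_order({"new glibc": {"new pam"}, "new pam": {"new glibc"}}))
# -> None
```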

> > I definitely want to change the glibc default.
> 
> Yes, fully agreed -- being able to resolve "localhost" without
> /etc/nsswitch.conf and /etc/hosts would certainly be nice. Right now one
> needs libnss-systemd's "myhostname" for that. But right now one has to pick
> between "get rid of /etc/hosts" (and enable myhostname in nsswitch) and "get
> rid of nsswitch" (and then having to keep /etc/hosts). So much legacy :'-(

/etc/hosts will still be needed for a while, for the localhost definition.

> I just thought that fixing *this* upgrade issue is a simple Conflicts:
> addition, and that changing the builtin defaults is much more
> discussion/work/risk.

I don't think it's *that* simple because of the potential for a dependency loop, sorry.

Comment 8 Martin Pitt 2021-12-20 13:27:29 UTC
> The loop is still there. RPM will not know whether to update pam first or
> glibc if the new pam requires the new glibc.

I see -- that'd be a case for dpkg's "Breaks:", but I don't think rpm has that.

> This has to be handled in a different way.

OK, too bad -- by now it has hopefully also kind of fixed itself by Fedora mirrors catching up.

> > But right now one has to pick
> > between "get rid of /etc/hosts" (and enable myhostname in nsswitch) and "get
> > rid of nsswitch" (and then having to keep /etc/hosts). So much legacy :'-(
> 
> /etc/hosts will still be needed for a while, for the localhost definition.

Not really, "myhostname" takes care of that. I haven't had one for a while, and it'd be nice for distros to stop creating it. But that's of course a tangent.

> I don't think it's *that* simple because of the potential for a dependency
> loop, sorry.

OK, then I suppose the only way is to declare the broken dependencies as "wontfix", and not support partial upgrades (FWIW, it's not like I was actively trying to do a partial upgrade, dnf just happened to do that when installing packages -- but again, hopefully fixed now with a fresh set of cloud images).

Thanks!

Comment 11 Fedora Update System 2022-01-24 21:07:38 UTC
FEDORA-2022-9421366d9c has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2022-9421366d9c

Comment 12 Fedora Update System 2022-01-25 02:10:50 UTC
FEDORA-2022-9421366d9c has been pushed to the Fedora 35 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-9421366d9c`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-9421366d9c

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 13 Fedora Update System 2023-09-29 01:33:52 UTC
FEDORA-2023-93246fc470 has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2023-93246fc470

Comment 14 Fedora Update System 2023-09-29 01:34:44 UTC
FEDORA-2023-93246fc470 has been pushed to the Fedora 40 stable repository.
If problem still persists, please make note of it in this bug report.

