Bug 2185880
| Summary: | Missing permissions in dnsmasq to chown PID file (with [main].dns=dnsmasq mode in NetworkManager) | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Marko Myllynen <myllynen> |
| Component: | NetworkManager | Assignee: | NetworkManager Development Team <nm-team> |
| Status: | CLOSED NOTABUG | QA Contact: | Desktop QE <desktop-qa-list> |
| Severity: | unspecified | Priority: | unspecified |
| Version: | 9.1 | CC: | bgalvani, lrintel, rkhan, sfaye, sukulkar, thaller, till |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Last Closed: | 2023-04-28 07:35:00 UTC | Type: | Bug |

Description
Marko Myllynen
2023-04-11 13:43:27 UTC
The PID file is created by dnsmasq (NetworkManager might only read it and unlink() it).
Also, when NetworkManager spawns dnsmasq, it does so "directly". So dnsmasq should also run as root:root.
However, NetworkManager.service does not have CAP_CHOWN (via CapabilityBoundingSet=; see `systemctl cat NetworkManager.service`).
dnsmasq does:
    /* We're still running as root here. Change the ownership of the PID file
       to the user we will be running as. Note that this is not to allow
       us to delete the file, since that depends on the permissions
       of the directory containing the file. That directory will
       need to by owned by the dnsmasq user, and the ownership of the
       file has to match, to keep systemd >273 happy. */
    if (getuid() == 0 && ent_pw && ent_pw->pw_uid != 0 && fchown(fd, ent_pw->pw_uid, ent_pw->pw_gid) == -1)
        chown_warn = errno;
so it tries to change the owner, but expectedly fails.
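To make the failure mode concrete: a process can ask the kernel whether CAP_CHOWN is still in its capability bounding set. The sketch below is only an illustration and is not code from dnsmasq or NetworkManager; the helper name `has_cap_chown_in_bounding_set` is made up here. It relies on the Linux-specific `prctl(PR_CAPBSET_READ, ...)` call. A child spawned directly by a service whose CapabilityBoundingSet= omits CAP_CHOWN would be expected to see 0 here, which is consistent with fchown() to another user failing even though the process runs as root.

```c
#include <linux/capability.h>   /* CAP_CHOWN */
#include <sys/prctl.h>          /* prctl(), PR_CAPBSET_READ */

/* Hypothetical helper (not part of dnsmasq or NetworkManager).
 * Returns 1 if CAP_CHOWN is in the calling process's capability
 * bounding set, 0 if it has been dropped, and -1 if the kernel
 * rejects the query (errno is set in that case). */
int has_cap_chown_in_bounding_set(void)
{
    return prctl(PR_CAPBSET_READ, (unsigned long)CAP_CHOWN, 0UL, 0UL, 0UL);
}
```

Calling this helper from a small program started the same way as the daemon in question (rather than from an interactive root shell) is what makes the answer meaningful, since the bounding set is inherited from the parent.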
Marko: it seems the warning is mostly harmless (aside from the fact that warnings by themselves scare the user and are always undesirable). Or what actual problems does it cause?
For dnsmasq to have that permission, NM would need to get that capability. We would rather avoid granting more capabilities to NetworkManager.
The proper solution would be to not run dnsmasq as a child process of NetworkManager, but instead as a service spawned differently (for example, via a separate systemd service or via the nm-priv-helper service, which has all permissions). While that would be great (for multiple reasons), it's also significant effort for a non-default configuration of NetworkManager (dns=dnsmasq mode in NetworkManager has its own set of problems).
An alternative solution might be that dnsmasq just silently accepts the fact that it has no permissions. SELinux or missing capabilities can cause an EPERM error for chown. It seems wrong to print a warning about that. Is that really useful information for the user? What is the user expected to do about it? How about:
    if (chown_warn != 0)
        my_syslog(chown_warn == EPERM ? LOG_DEBUG : LOG_WARNING, "chown of PID file %s failed: %s", daemon->runfile, strerror(chown_warn));
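For anyone who wants to try the idea outside the dnsmasq tree, here is a minimal self-contained sketch of the same priority selection. It is not the actual dnsmasq patch: `chown_failure_priority` and `report_chown_failure` are hypothetical names, and plain syslog(3) stands in for dnsmasq's `my_syslog()`.

```c
#include <errno.h>
#include <string.h>
#include <syslog.h>

/* Hypothetical helper: choose the syslog priority for a failed chown.
 * EPERM (e.g. a missing CAP_CHOWN or an SELinux denial) is demoted to
 * debug level; any other error stays a warning. */
int chown_failure_priority(int chown_errno)
{
    return chown_errno == EPERM ? LOG_DEBUG : LOG_WARNING;
}

/* Stand-in for the dnsmasq logging call; uses syslog(3) directly.
 * Does nothing when chown_errno is 0, i.e. when the chown succeeded. */
void report_chown_failure(const char *runfile, int chown_errno)
{
    if (chown_errno != 0)
        syslog(chown_failure_priority(chown_errno),
               "chown of PID file %s failed: %s",
               runfile, strerror(chown_errno));
}
```

With this split, an error the user cannot act on stays out of the default log view, while anything else (a read-only filesystem, for instance) is still reported at warning level.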
Thanks for looking into this. I started looking at the logs in more detail when I came across this dnsmasq crash when used for DNS caching: https://bugzilla.redhat.com/show_bug.cgi?id=2185878

My guess is that the warning has nothing to do with the crash. However, I know that when people less familiar with these components investigate issues, they might get sidetracked or stuck on these sorts of messages, and hours if not days of troubleshooting time could be lost to such red herrings. And here, if the crash is unrelated, then this has no functional impact at all AFAICS.

I think your suggested solution might be the best option here: if the user can't do anything about something and things work, then why print such a message?

> dns=dnsmasq mode in NetworkManager has its own set of problems

Is there somewhere I could learn more about this? Do you not recommend using this mode at all? We have RHKB articles suggesting this approach, so I'm under the impression this should be at least a somewhat decent solution for implementing local DNS caching.

Thanks.

OFFTOPIC:

> > dns=dnsmasq mode in NetworkManager has its own set of problems
>
> Is there somewhere I could learn more about this? Do you not recommend using this mode at all? We have RHKB articles suggesting this approach, so I'm under the impression this should be at least a somewhat decent solution for implementing local DNS caching.

To be clear: if it works for you, then great! Yes, it's one of two ways to get DNS caching and split DNS with NetworkManager (alongside dns=systemd-resolved).

Pros:

1a) it's a simple way to get split DNS and a local caching server.

2a) compared to systemd-resolved: systemd-resolved gets criticized for the way it does DNSSEC and udp:53 (non-conforming?). To some people that seems to be a major problem, and NetworkManager's dns=dnsmasq may(??) do better here.

Cons:

1b) the lifetime of dnsmasq is tied to NetworkManager. That causes subtle problems when restarting NetworkManager (which users probably do only seldom), and NetworkManager needs capabilities only to pass them on to its child process (CAP_CHOWN in this case). This should be fixed by running dnsmasq as a system service, with separate sandboxing and service lifetime.

2b) when NM runs dnsmasq, no other components can inject dynamic DNS settings. For example, WireGuard's wg-quick also wants to set up DNS (as it runs parallel to NM). Multiple components can cooperate by using resolvconf (https://en.wikipedia.org/wiki/Resolvconf, which doesn't exist on Fedora/RHEL) or systemd-resolved. Both provide an API so multiple components can cooperate to configure DNS. NetworkManager is not a replacement for resolvconf and provides no such API. Instead, the idea is that NetworkManager can itself either use rc-manager=resolvconf or dns=systemd-resolved and integrate that way with other components. With NetworkManager's dns=dnsmasq, that is not possible.

This may be no problem on your personal machine. It's a problem for a distribution default, where we also want to ship e.g. wg-quick. On the other hand, on a distribution we may want a local caching DNS or split DNS by default. We also would rather focus on getting *one* solution to work well, instead of spreading our efforts (patches welcome though).

This is only my opinion, but I would rather use dns=systemd-resolved, which is also the default on Fedora and Ubuntu (caveat 2a). dnsmasq is however great as a stand-alone DNS+DHCP server and is also used by NetworkManager's `ipv4.method=shared`!

Thanks for the additional color, very helpful and much appreciated! One notable aspect in the RHEL world is that systemd-resolved is still in Tech Preview, but given your explanation perhaps that would be the preferred and supported option one day.

Closing this bz after discussion as NOTABUG.