Bug 2049421 - nm-online doesn't wait for internal dnsmasq instances to be functional before considering the network is up
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: NetworkManager
Version: 8.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Thomas Haller
QA Contact: Filip Pokryvka
URL:
Whiteboard:
Depends On:
Blocks: 2090344
Reported: 2022-02-02 08:35 UTC by Renaud Métrich
Modified: 2022-11-08 11:14 UTC (History)

Fixed In Version: NetworkManager-1.39.2-1.el8
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-08 10:07:32 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHELPLAN-110661 | 0 | None | None | None | 2022-02-02 08:44:28 UTC
Red Hat Knowledge Base (Solution) | 4879291 | 0 | None | None | None | 2022-02-07 07:26:36 UTC
Red Hat Product Errata | RHBA-2022:7680 | 0 | None | None | None | 2022-11-08 10:08:14 UTC
freedesktop.org Gitlab | NetworkManager NetworkManager-ci merge_requests 1069 | 0 | None | opened | general: add nm_online_wait_for_dnsmasq test | 2022-06-08 13:22:23 UTC
freedesktop.org Gitlab | NetworkManager NetworkManager merge_requests 1189 | 0 | None | opened | [th/dns-update-pending-rh2049421] avoid startup-complete and device activation while DNS update pending | 2022-04-12 12:19:56 UTC

Description Renaud Métrich 2022-02-02 08:35:45 UTC
Description of problem:

When NetworkManager is configured with multiple internal dnsmasq instances, initializing them takes some time, but nm-online doesn't take these services into account at all. This leads to a race condition in which remote mounts are started while DNS is not yet functional.
In KCS https://access.redhat.com/solutions/4879291, I proposed a solution that consists of pinging the DNS servers listed in /etc/resolv.conf, but this doesn't work in this particular case: NetworkManager sets the DNS server to 127.0.0.1, which is always reachable, so a successful ping doesn't mean the dnsmasq services are available yet.

Hence we really need something in NetworkManager that guarantees DNS is functional.
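
To illustrate why a ping to 127.0.0.1 is not enough, here is a minimal Python sketch of a probe that only succeeds once something actually answers a DNS query on the local resolver address. It is not NetworkManager's fix and not part of the KCS article. The function names, the probe name "localhost", the timeout values and the decision to accept any DNS reply (even an error reply) as "ready" are all assumptions made for illustration.

```python
import socket
import struct
import time


def build_query(name: str, query_id: int) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def is_response_to(data: bytes, query_id: int) -> bool:
    """Return True if data is a DNS response (QR bit set) with our query ID."""
    if len(data) < 12:
        return False
    resp_id, flags = struct.unpack("!HH", data[:4])
    return resp_id == query_id and bool(flags & 0x8000)


def wait_for_dns(server="127.0.0.1", port=53, name="localhost",
                 timeout=30.0, interval=0.5) -> bool:
    """Poll the server until it answers a DNS query or the timeout expires.

    Assumption: any DNS reply counts as "the resolver is up". A stricter
    probe could additionally require a successful response code.
    """
    deadline = time.monotonic() + timeout
    query_id = 0
    while time.monotonic() < deadline:
        query_id = (query_id + 1) & 0xFFFF
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(interval)
            try:
                sock.sendto(build_query(name, query_id), (server, port))
                data, _ = sock.recvfrom(512)
                if is_response_to(data, query_id):
                    return True
            except OSError:
                # Covers receive timeouts and send/receive errors: not ready.
                pass
        time.sleep(interval)
    return False
```

A caller could run `wait_for_dns()` before starting services that depend on name resolution and treat a `False` result as "DNS not ready within the timeout".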

Ideally this should not be part of NetworkManager-wait-online, which is used to reach network-online.target.
Instead, I see two alternatives:
1. having a new service, e.g. NetworkManager-dns-online, which would be used to reach nss-lookup.target
or
2. having a "NetworkManager DNS generator" systemd generator that would create runtime systemd services, one per dnsmasq instance, that would be in charge of making sure the dnsmasq instance is operational before reaching nss-lookup.target

The advantage of solution 2 is that it's easily extensible from systemd's perspective.
The drawback of solution 2 is that generating runtime services would require reloading the systemd configuration afterwards.


Version-Release number of selected component (if applicable):

NetworkManager-1.32.10-4.el8.x86_64


How reproducible:

Always on customer setup


Steps to Reproduce:
1. Have multiple dnsmasq instances in /etc/NetworkManager/dnsmasq.d forwarding to some DNS servers but also defining their own records
2. Set up the node as an NFS server exporting paths to specific systems that are identified by name

Actual results:

NFS exports defined by name fail at boot

Expected results:

DNS is functional by the time network-online.target is reached

Comment 1 Thomas Haller 2022-02-02 08:55:18 UTC
> Have multiple dnsmasq instances in /etc/NetworkManager/dnsmasq.d

What does it mean to have multiple dnsmasq instances? I presume you are configuring `[main].dns=dnsmasq` in NetworkManager.conf in your case. But then there is only one dnsmasq instance.
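
For readers unfamiliar with the `[main].dns=dnsmasq` notation: it refers to the `dns` key in the `[main]` section of NetworkManager.conf. The following Python sketch checks a configuration text for that setting. The helper name and the sample text are hypothetical, and the sketch only inspects the text it is given. It does not merge drop-in files the way NetworkManager itself does.

```python
import configparser


def uses_dnsmasq_plugin(conf_text: str) -> bool:
    """Return True if the given config text sets dns=dnsmasq under [main]."""
    # strict=False tolerates repeated sections or keys; interpolation is
    # disabled so '%' characters in values are taken literally.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return parser.get("main", "dns", fallback="").strip() == "dnsmasq"


# Hypothetical sample mirroring the setting mentioned above.
sample = """
[main]
dns=dnsmasq
"""
print(uses_dnsmasq_plugin(sample))
```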



How long NetworkManager-wait-online (NM-w-o) blocks depends strongly on your configuration.

Could you please provide a complete level=TRACE log of a boot that shows this problem, so we can see what exactly is happening?
See https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/main/contrib/fedora/rpm/NetworkManager.conf#L27 for info about logging.

Comment 2 Renaud Métrich 2022-02-02 09:05:21 UTC
Actually yes, there is only a single dnsmasq instance, sorry for the confusion.
I'm requesting the log with tracing enabled.

Comment 5 Thomas Haller 2022-02-04 22:23:34 UTC
Renaud, yes, that's right. A bug :)

All information present. Thank you.

(this needs fixing).

Comment 12 errata-xmlrpc 2022-11-08 10:07:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7680

