Bug 1422610
| Summary: | NM changes hostname to localhost.localdomain even though no devices are managed by it | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Edward Haas <edwardh> |
| Component: | NetworkManager | Assignee: | Francesco Giudici <fgiudici> |
| Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 7.3 | CC: | aloughla, atragler, bgalvani, bugs, danken, edwardh, fgiudici, lrintel, mburman, myakove, nsednev, rkhan, sukulkar, thaller, vbenes, ycui, ylavi |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | NetworkManager-1.8.0-0.4.rc1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1419906 | Environment: | |
| Last Closed: | 2017-08-01 09:22:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 132679, 1405275, 1419906 | ||
|
Description
Edward Haas
2017-02-15 16:48:40 UTC
Hi Edward, can you please add some details to the scenario to reproduce the bug? I tried to create a new connection in RHEL-7.3, restart network service but I was not able to see the issue. Anyway, I was able to find an issue with this scenario: 1) create ifcfg file with IPv4DHCP with NM_CONTROLLED=NO 2) configure dhcp server to provide hostname option 3) reboot Then I found the dhclient getting the lease, transient hostname was set accordingly and NM immediately changed it back to localhost.localdomain. I was able to trigger this only at boot anyway. If this is not the scenario you are addressing, please provide more information on how to reproduce it. Some more extensive log could be of help too. Thanks Francesco The issue is probably due to the fact that NetworkManager ignores external hostname changes after startup: it never checks if someone else has changed the hostname. When sooner or later something triggers NetworkManger to check / update the hostname (e.g., a connection is teared down or devices are set to unmanaged) it will just update the hostname as best as it can, leveraging its internal information only. In case all devices are unmanaged, it will try to restore the hostname that was in place when it started: if was unset, it will set just a default "localhost". So, proposed patch will let NetworkManager check if someone else has changed the hostname before trying to update it. It will still overwrite it but only if the hostname value comes from a DHCP hostname option in an active NM connection. Otherwise it will just skip the update. In both the cases, it will record the hostname as fallback for future updates. Please, review upstream branch: keep_externally_set_hostname-rh1422610 Can you review? : https://github.com/NetworkManager/NetworkManager/compare/keep_externally_set_hostname-rh1422610 Well, asking review mainly to NetworkManager developers. Any other review anyway much appreciated! Edward, if you need a build to test just ask. I'm anyway quite confident this should fix the issue. Thanks! (Moving bug to POST) Got review from Beniamino (thanks!), repushed upstream on branch: fg/keep_externally_set_hostname-rh1422610 Isn't this bug is a duplicate of 1419906? (In reply to Nikolai Sednev from comment #8) > Isn't this bug is a duplicate of 1419906? This bug is short and clean and is opened on the correct component. Your bug 1419906 should stay as it is, hopefully becoming a TestOnly bug. What is the status of this bug? (In reply to Yaniv Dary from comment #10) > What is the status of this bug? Is in progress. The provided patch aims to detect external changes to the hostname by NetworkManager. The problem is that there is a race between NM reading the hostname and then setting it and an external set. So there is no guarantee with this approach that NetworkManager will not override the hostname. While hostname management needs some deep rework (will be addressed in bug #1405275) I would like to bring here some solution that can be easily backported to 7.3. Two solutions could be viable: 1) Add a global option to disable (transient) hostname management in NM. 2) Disable hostname changes when there are no active connections. In the meanwhile the branch fg/keep_externally_set_hostname-rh1422610 got updated: it fixes a race in the get/set of the hostname inside NetworkManager itself on the current branch (#1405275 rework will solve this by design). The updated branch would probably work in many cases but still cannot guarantee that hostname is not changed by NetworkManager when externally set, due to the race (with the EXTERNAL setter) I mentioned above. WiP is to add to the branch option 1) OR 2). I'm currently looking at option 2 but is a bit more complex than option 1. Do you think option 1) could be viable too for this use case? The main issue is that NM is trying to manage a device that is not managed by NM. Can't we fix that? Have NM not control things that are managed by another service? (In reply to Yaniv Dary from comment #12) > The main issue is that NM is trying to manage a device that is not managed > by NM. NetworkManager manages the hostname and on some triggering events it will update it at the best of its knowledge. I think the issue was that the hostname was overwritten. If there is more, and NetworkManager manages devices it was told not to manage, please provide some logs... this was not clear at all till now. (In reply to Francesco Giudici from comment #13) > (In reply to Yaniv Dary from comment #12) > > The main issue is that NM is trying to manage a device that is not managed > > by NM. > > NetworkManager manages the hostname and on some triggering events it will > update it at the best of its knowledge. I think the issue was that the > hostname was overwritten. > If there is more, and NetworkManager manages devices it was told not to > manage, please provide some logs... this was not clear at all till now. I think that Yaniv Dary was actually asking for something similar to option 2, when there are no active connections. Yaniv, the hostname is a host level configuration and currently NM tries to control it, regardless of connection profiles (which are at the device level). The thing is, that it may not cover all the scenarios, as there may be active connections that have static settings (no DHCP), which should not interfere with the hostname. VDSM specifically, acquires from NM only the devices it wants to managed, not all. (In reply to Edward Haas from comment #14) > > The thing is, that it may not cover all the scenarios, as there may be > active connections that have static settings (no DHCP), which should not > interfere with the hostname. VDSM specifically, acquires from NM only the > devices it wants to managed, not all. Thanks Edward, I see. So, here the ideal solution seems to me to have NetworkManager updating the hostname only if it has an hostname option coming from DHCP. This could be done making NM hostname management globally configurable. I will work on this part first. What is the status of this bug? (In reply to Yaniv Dary from comment #16) > What is the status of this bug? Hi Yaniv, just updated the branch fg/keep_externally_set_hostname-rh1422610. Taken some time to do some basic tests on it. Here what has been put into: 1) detection of external hostname change this will prevent NM to overwrite the hostname with locahost if the hostname was set outside NM. This could not be granted anyway if the external set and the hostname update in NM happens at almost the same time (get and set of the hostname from NM cannot be atomic). 2) don't trigger hostname update when events are detected on unmanged devices 3) add a new global configure option that allows to change the hostname update behavior of NM. Three options are available: - none (NM will not update the hostname) - dhcp (NM will update the hostname only from options in DHCP) - full (full update policy, as before) Please, review the branch > policy: remove useless checks and redundant var when setting the hostname I don't understand this. Could you explain better in the commit message? > policy: detect if the hostname was changed outside NetworkManager + /* Cancel ongoing lookup first*/ + priv->lookup_cancellable = g_cancellable_new (); the comment seems wrong. Maybe also not move creation of lookup_cancellable before /* Check if the hostname was externally set */ ? > config: add new hostname_mode configuration nm_config_data_get_hostname_mode() is only used once, inside nm_policy_init(). There is no need to add an API for that and cache the value for the entire duration. Just use nm_config_data_get_value() in nm_policy_init(). But let's use a define #define NM_CONFIG_KEYFILE_KEY_MAIN_HOSTNAME "hostname" in nm-config.h > policy: add support to configurable hostname mode I am not convinced about the option name [main] hostname= it's not really the "hostname" but the "hostname-mode". The rest of the code also calls it "hostname-mode". also: + if nm_streq0 (hostname_mode, "none") + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_UNMANAGED; + else if nm_streq0 (hostname_mode, "dhcp") + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_DHCP; + else /* default - full mode */ + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_FULL; let's not do "none" vs. UNMANAGED? The names of NM_POLICY_HOSTNAME_MODE_* should correspond to the configuration options. + hostname_mode = nm_config_data_get_hostname_mode (nm_config_get_data_orig (config)); let's use NM_CONFIG_GET_DATA_ORIG. > policy: add some verbose logging for tracking hostname management all logging related to the same topic should have a common prefix (preferably, unique, so you can grep for it). Maybe "set-hostname:" (In reply to Thomas Haller from comment #19) > > policy: remove useless checks and redundant var when setting the hostname > > I don't understand this. Could you explain better in the commit message? Changed with: "policy: remove redundant check in _set_hostname" What I mean is that there are 2 checks and one "else" branch: 1) if ( priv->orig_hostname && (priv->hostname_changed == FALSE) && g_strcmp0 (priv->orig_hostname, new_hostname) == 0) { /* HERE NOTHING IS DONE */ 2) } else if (g_strcmp0 (priv->cur_hostname, new_hostname) == 0) { /* HERE NOTHING IS DONE */ 3) } else { /* HERE SOME "ACTION" TAKES PLACE */ As condition 2 is a subset of condition 1, when 1 is true, 2 will always be true too. And as the action taken for condition 1) and condition 2) is the same (NOTHING), condition 1 can be removed without functional change. This means also that "hostname_changed" var is now useless. Then, condition 2 can be reverted doing there the action of the "else" branch (3). So the whole "if/else if/else" can be substituted with a single "if" construct. + /* Update the DNS only if the hostname isn't actually + * going to change. + */ + if (!nm_streq0 (priv->cur_hostname, new_hostname)) { I don't want to be overkill in the commit msg... but I can put something like this ^^^ in if you think would be better. > > > policy: detect if the hostname was changed outside NetworkManager > > + /* Cancel ongoing lookup first*/ > + priv->lookup_cancellable = g_cancellable_new (); > > the comment seems wrong. Maybe also not move creation of lookup_cancellable > before /* Check if the hostname was externally set */ ? Sure, those are both wrong. Removing the comment and moving the lookup_cancellable creation back in the right place. > > > config: add new hostname_mode configuration > > nm_config_data_get_hostname_mode() is only used once, inside > nm_policy_init(). There is no need to add an API for that and cache the > value for the entire duration. Just use nm_config_data_get_value() in > nm_policy_init(). > > But let's use a define > #define NM_CONFIG_KEYFILE_KEY_MAIN_HOSTNAME "hostname" > in nm-config.h Done, thanks > > > policy: add support to configurable hostname mode > > I am not convinced about the option name > > [main] > hostname= > > it's not really the "hostname" but the "hostname-mode". The rest of the code > also calls it "hostname-mode". Yeah, seems better, changed > > > also: > + if nm_streq0 (hostname_mode, "none") > + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_UNMANAGED; > + else if nm_streq0 (hostname_mode, "dhcp") > + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_DHCP; > + else /* default - full mode */ > + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_FULL; > > let's not do "none" vs. UNMANAGED? The names of NM_POLICY_HOSTNAME_MODE_* > should correspond to the configuration options. Changed NM_POLICY_HOSTNAME_MODE_UNMANAGED --> NM_POLICY_HOSTNAME_MODE_NONE ("none" is coherent with "rc-managed" options) > > > + hostname_mode = nm_config_data_get_hostname_mode > (nm_config_get_data_orig (config)); > > let's use NM_CONFIG_GET_DATA_ORIG. Done > > > > policy: add some verbose logging for tracking hostname management > > all logging related to the same topic should have a common prefix > (preferably, unique, so you can grep for it). > Maybe > "set-hostname:" Done Thanks. Updated branch fg/keep_externally_set_hostname-rh1422610 > policy: remove redundant check in _set_hostname + /* Update the DNS only if the hostname isn't actually + * going to change. s/isn't/is/ ? Also, it seems the check was introduced in e04cbae to avoid spurious rewrites of resolv.conf. At start priv->cur_hostname is NULL, and we don't want to notify the DNS manager if @new_hostname is equal to priv->orig_hostname, right? > policy: detect if the hostname was changed outside NetworkManager /* Ask NMSettings to update the transient hostname using its * systemd-hostnamed proxy */ nm_settings_set_transient_hostname (priv->settings, name, settings_set_hostname_cb, - NULL); + self); I think you need to g_object_ref(self) to avoid that @self gets disposed before the callback is run. > policy: add support to configurable hostname mode + if nm_streq0 (hostname_mode, "none") + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_NONE; + else if nm_streq0 (hostname_mode, "dhcp") + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_DHCP; + else /* default - full mode */ + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_FULL; Add parenthesis around conditions after if. > policy: allow reset of dhcp hostname in "dhcp" hostname mode config Can you add a comment or commit message (or both) to explain why this is needed? (In reply to Beniamino Galvani from comment #21) > > policy: remove redundant check in _set_hostname > > + /* Update the DNS only if the hostname isn't actually > + * going to change. > > > s/isn't/is/ ? ops.. fixed, thanks > > Also, it seems the check was introduced in e04cbae to avoid spurious > rewrites of resolv.conf. At start priv->cur_hostname is NULL, and we > don't want to notify the DNS manager if @new_hostname is equal to > priv->orig_hostname, right? Yeah... a little bit convoluted but the DNS manager will skip the update if it gets an hostname it is already aware of. So, also this case should be safe. > > > > policy: detect if the hostname was changed outside NetworkManager > > /* Ask NMSettings to update the transient hostname using its > * systemd-hostnamed proxy */ > nm_settings_set_transient_hostname (priv->settings, > name, > settings_set_hostname_cb, > - NULL); > + self); > > I think you need to g_object_ref(self) to avoid that @self gets > disposed before the callback is run. Fixed, thanks > > > > policy: add support to configurable hostname mode > > + if nm_streq0 (hostname_mode, "none") > + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_NONE; > + else if nm_streq0 (hostname_mode, "dhcp") > + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_DHCP; > + else /* default - full mode */ > + priv->hostname_mode = NM_POLICY_HOSTNAME_MODE_FULL; > > Add parenthesis around conditions after if. Fixed > > > > policy: allow reset of dhcp hostname in "dhcp" hostname mode config > > Can you add a comment or commit message (or both) to explain why this > is needed? Sure, done. Updated branch fg/keep_externally_set_hostname-rh1422610 Merged upstream: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=017b7c2bb6545f48bfc9c0357769801a488a619a Note that the configuration key name as changed to "hostname-mode". We do not need the backport, please remove the z-stream flag unless you need it for anything else. (In reply to Yaniv Dary from comment #24) > We do not need the backport, please remove the z-stream flag unless you need > it for anything else. I don't need it too, thanks (and I cannot remove it). Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2299 |