Bug 1246496
| Summary: | dhclient is terminated and won't start after restart NetworkManager | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | chocean | ||||||
| Component: | NetworkManager | Assignee: | Beniamino Galvani <bgalvani> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | high | ||||||||
| Version: | 7.0 | CC: | bgalvani, chocean, dcbw, jklimes, lrintel, rkhan, thaller, tlavigne, vbenes | ||||||
| Target Milestone: | rc | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | NetworkManager-1.0.6-6.el7 | Doc Type: | Bug Fix | ||||||
| Doc Text: | When NetworkManager was restarted with an active DHCP connection, a temporary failure of the DHCP server could result in a loss of connectivity. Now NetworkManager tries multiple times to obtain a DHCP address when an active DHCP connection is found upon restart. | Story Points: | --- | ||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2015-11-19 11:02:30 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
|
Description
chocean
2015-07-24 13:00:17 UTC
When NetworkManager is restarted, it should try to assume the existing DHCP connection and spawn dhclient again, but if the DHCP server is not available, dhclient will terminate after some time. How long do you wait to check for the presence of dhclient after NetworkManager has been restarted?

Could you please enable verbose logging in /etc/NetworkManager/NetworkManager.conf by adding (or changing) the following:

[logging]
level=DEBUG

Then repeat the steps in the bug description and attach the output of 'journalctl -u NetworkManager -b'.

Please note that at the moment, if the DHCP server can't be contacted within a given interval (45 seconds), the activation of the connection fails and DHCP is never retried automatically.

Created attachment 1063193 [details]
network manager debug log
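
For anyone scripting the logging change requested above, the following is a minimal sketch in Python, assuming the file parses as a plain INI file. The helper name `with_debug_logging` is hypothetical and not part of NetworkManager. Note that `configparser` drops comments when it rewrites the text, so keep a backup of the original file.

```python
import configparser
import io


def with_debug_logging(conf_text: str) -> str:
    """Return conf_text with level=DEBUG set in the [logging] section.

    Hypothetical helper: it treats the file as plain INI, which is an
    assumption about the contents of NetworkManager.conf.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of keys
    parser.read_string(conf_text)

    # Add the section if it is missing, otherwise just change the level.
    if not parser.has_section("logging"):
        parser.add_section("logging")
    parser.set("logging", "level", "DEBUG")

    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)  # write key=value
    return out.getvalue()


if __name__ == "__main__":
    print(with_debug_logging("[main]\nplugins=ifcfg-rh\n"))
```

After writing the result back to /etc/NetworkManager/NetworkManager.conf and restarting NetworkManager, the debug output can be collected with 'journalctl -u NetworkManager -b' as requested.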
@Beniamino I have attached the log you requested.

However, I do not think the behavior, "if the DHCP server can't be contacted within a given interval (45 seconds), the activation of the connection fails and DHCP is never retried automatically", is reasonable. Please consider a scenario like this:

1. Servers use DHCP to get an IP address.
2. The switch for this network breaks by accident.
3. The DHCP client releases the IP address after the lease expires.
4. The switch is replaced and comes up again.
5. All servers have a network connection but none of them has an IP address, because they do not retry the DHCP client.

(In reply to chocean from comment #4)
> @Beniamino
> I have attached the log you requested.

Thanks.

> However, I do not think the behavior, "if the DHCP server can't be contacted
> within a given interval (45 seconds), the activation of the connection
> fails and DHCP is never retried automatically", is reasonable.

Please consider that the described behavior applies only to the case in which the connection is assumed, i.e. NetworkManager finds the device already configured with an IP address when starting. This happens, among other cases, when you restart NetworkManager; since it doesn't save persistent information about the state of connections, NM tries to reuse the existing configuration and doesn't take further initiative if DHCP fails.

The problem, as you noticed, is related to the restart of NM. In my opinion, you should not restart it unless you have good reasons to, and in that case you should manually activate the connection with "nmcli c up CONN-ID".

> Please consider a scenario like this:
>
> 1. Servers use DHCP to get an IP address.
> 2. The switch for this network breaks by accident.
> 3. The DHCP client releases the IP address after the lease expires.
> 4. The switch is replaced and comes up again.
> 5. All servers have a network connection but none of them has an IP address,
> because they do not retry the DHCP client.

Unless the connection is assumed, this should work properly: the connection would fail but would be retried a predefined number of times. And even if all tries fail, the whole procedure is restarted after 5 minutes.

Created attachment 1069300 [details]
[PATCH] device: retry DHCP after timeout/expiration for assumed connections
Maybe we should try harder to keep the DHCP lease when the
connection is assumed, because otherwise a temporary failure in the
DHCP server will cause the device to move permanently to ACTIVATED
state without IP configuration.
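
To make the difference concrete, here is a toy Python model of the behavior discussed in this report. It is not NetworkManager code: the function names, the state strings, and the `max_tries` value are all hypothetical, since the retry count used by the patch is not stated here. Each call to `attempt` stands for one DHCP transaction that either obtains a lease or times out.

```python
from typing import Callable, Tuple


def dhcp_for_assumed(attempt: Callable[[], bool],
                     retry: bool,
                     max_tries: int = 4) -> Tuple[str, int]:
    """Model DHCP handling for an assumed connection.

    retry=False: one attempt only, as described before the patch.
    retry=True:  keep trying up to max_tries (hypothetical limit).
    Returns the final state and the number of attempts made.
    """
    tries = max_tries if retry else 1
    for n in range(1, tries + 1):
        if attempt():
            return ("ip-configured", n)
    # Device stays active but has no IP configuration.
    return ("activated-without-ip", tries)


def flaky_server(failures: int) -> Callable[[], bool]:
    """Simulated DHCP server that times out `failures` times, then answers."""
    remaining = [failures]

    def attempt() -> bool:
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return attempt


if __name__ == "__main__":
    print(dhcp_for_assumed(flaky_server(2), retry=False))
    print(dhcp_for_assumed(flaky_server(2), retry=True))
```

With a server that is unreachable for the first two attempts, the no-retry policy leaves the device without an address, while the retrying policy recovers on the third attempt.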
(In reply to Beniamino Galvani from comment #6)
> Created attachment 1069300 [details]
> [PATCH] device: retry DHCP after timeout/expiration for assumed connections
>
> Maybe we should try harder to keep the DHCP lease when the
> connection is assumed, because otherwise a temporary failure in the
> DHCP server will cause the device to move permanently to ACTIVATED
> state without IP configuration.

LGTM

Applied to master (998ab889495c) and nm-1-0 (2b9db2bf1b9f).

minor bug filed: bug 1265239

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-2315.html