Red Hat Bugzilla – Bug 1246496
dhclient is terminated and won't start after restart NetworkManager
Last modified: 2015-11-19 06:02:30 EST
Description of problem:
dhclient is terminated and won't start after NetworkManager is restarted when the DHCP server cannot be reached.

Version-Release number of selected component (if applicable):
I am actually using CentOS 7.

How reproducible:

Steps to Reproduce:
1. Precondition: use DHCP and NetworkManager to configure the IP address
2. systemctl restart NetworkManager
3. Stop the DHCP server

Actual results:
1. dhclient (started by NetworkManager) is terminated
2. dhclient won't start if I run: systemctl restart NetworkManager
3. I cannot get an IP address from the DHCP server after the server is up

Expected results:
dhclient is running and can allocate a new address when the DHCP server is up again

Additional info:
"systemctl restart NetworkManager" is necessary to trigger this problem.
When NetworkManager is restarted, it should try to assume the existing DHCP connection and spawn dhclient again, but if the DHCP server is not available, dhclient will terminate after some time. How long do you wait before checking for the presence of dhclient after NetworkManager has been restarted?

Could you please enable verbose logging in /etc/NetworkManager/NetworkManager.conf by adding (or changing) the following:

  [logging]
  level=DEBUG

then repeat the steps in the bug description and attach the output of 'journalctl -u NetworkManager -b'?

Please note that at the moment, if the DHCP server can't be contacted within a given interval (45 seconds), the activation of the connection fails and DHCP is never retried automatically.
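To answer the timing question with something reproducible, a small script could poll for dhclient right after the restart and report when it disappears. This is only a sketch: the function names, the one-second poll interval and the 60-second window (chosen to cover the 45-second DHCP timeout) are assumptions, and it relies on `pgrep` being installed.

```python
import subprocess
import time
from typing import Callable, Optional


def dhclient_running() -> bool:
    """Return True if a process named exactly 'dhclient' exists (uses pgrep)."""
    result = subprocess.run(
        ["pgrep", "-x", "dhclient"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def seconds_until_gone(
    is_running: Callable[[], bool] = dhclient_running,
    window: float = 60.0,      # assumed observation window, in seconds
    interval: float = 1.0,     # assumed poll interval, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until the process disappears.

    Returns the elapsed seconds at the first poll that finds no process,
    or None if it is still running when the window ends.
    """
    elapsed = 0.0
    while elapsed <= window:
        if not is_running():
            return elapsed
        sleep(interval)
        elapsed += interval
    return None
```

Calling `seconds_until_gone()` immediately after `systemctl restart NetworkManager` gives a rough figure to report alongside the debug log; a result of `None` means dhclient survived the whole window.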
Created attachment 1063193 [details] network manager debug log
@Beniamino
I have attached the log you requested.

However, I do not think the behavior, "if the DHCP server can't be contacted within a given interval (45 seconds), the activation of the connection fails and DHCP is never retried automatically", is reasonable.

Please consider a scenario like this:

1. Servers use DHCP to get an IP address
2. The switch for this network breaks by accident
3. The DHCP client releases the IP address after the lease expires
4. The switch is replaced and is up again
5. All servers have a network connection but none of them has an IP address, because they do not retry the DHCP client.
(In reply to chocean from comment #4)
> @Beniamino
> I have attached log you required.

Thanks.

> However, I do not think the behavior, "if the DHCP server can't be contacted
> within a given interval (45 seconds), the activation of the connection
> fails and DHCP is never retried automatically", is reasonable.

Please consider that the described behavior applies only to the case in which the connection is assumed, i.e. NetworkManager finds the device already configured with an IP address when starting. This happens, among other cases, when you restart NetworkManager; since it doesn't save persistent information about the state of connections, NM tries to reuse the existing configuration and doesn't take further initiative if DHCP fails.

The problem, as you noticed, is related to the restart of NM. In my opinion, you should not restart it unless you have good reasons to, and in such case you should manually activate the connection with "nmcli c up CONN-ID".

> Please consider a scenario like this:
>
> 1. servers use dhcp to get ip address
> 2. switch device for this network is broken by accident
> 3. dhcp client release ip address after leasing expired
> 4. the switch device is replaced and up again
> 5. all servers have network connection but none of them have ip address,
> because they do not retry dhcp client.

Unless the connection is assumed, this should work properly as the connection would fail but would be retried for a predefined number of times. And even if all tries fail, the whole procedure is restarted after 5 minutes.
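The difference between the two cases can be illustrated with a toy timeline. The sketch below is not NetworkManager code: it assumes attempts run back to back, that an attempt succeeds if the server comes up before that attempt times out, and that the "predefined number of times" is 4 (the thread does not give the value). Only the 45-second timeout and the 5-minute restart come from this report.

```python
from typing import List, Tuple

DHCP_TIMEOUT_S = 45        # per-attempt timeout quoted in this report
RESTART_DELAY_S = 5 * 60   # pause before the whole procedure starts over
MAX_TRIES = 4              # ASSUMPTION: the thread only says "predefined"


def simulate(assumed: bool, server_up_at: float,
             horizon: float = 3600.0) -> Tuple[List[float], bool]:
    """Return (start times of DHCP attempts, whether a lease was obtained).

    assumed=True models a connection taken over after an NM restart:
    a single attempt, then no further initiative.
    """
    now = 0.0
    tries_in_round = 0
    attempts: List[float] = []
    while now < horizon:
        attempts.append(now)
        # Simplification: success if the server is up before this attempt times out.
        if server_up_at < now + DHCP_TIMEOUT_S:
            return attempts, True
        now += DHCP_TIMEOUT_S
        if assumed:
            return attempts, False      # never retried automatically
        tries_in_round += 1
        if tries_in_round == MAX_TRIES:
            tries_in_round = 0
            now += RESTART_DELAY_S      # whole procedure restarts later
    return attempts, False
```

With a server that returns after 10 minutes, `simulate(True, 600)` makes a single attempt and gives up, while `simulate(False, 600)` keeps trying and eventually obtains a lease under these assumptions.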
Created attachment 1069300 [details]
[PATCH] device: retry DHCP after timeout/expiration for assumed connections

Maybe we should try harder to keep the DHCP lease when the connection is assumed, because otherwise a temporary failure in the DHCP server will cause the device to move permanently to ACTIVATED state without IP configuration.
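The intent of the patch, as described in its title and comment, can be summarized as a small decision table. This is a hedged sketch rather than the patch itself: the enum and function names are hypothetical, and the real change lives in NetworkManager's C device code.

```python
from enum import Enum


class DhcpEvent(Enum):
    BOUND = "bound"
    TIMEOUT = "timeout"
    EXPIRED = "expired"


class Action(Enum):
    APPLY_LEASE = "apply lease"
    FAIL_ACTIVATION = "fail activation"    # normal retry logic takes over
    STAY_WITHOUT_IP = "stay ACTIVATED without IP configuration"
    RETRY_DHCP = "retry DHCP"


def next_action(event: DhcpEvent, assumed: bool, patched: bool) -> Action:
    """Hypothetical model of how a device reacts to a DHCP event."""
    if event is DhcpEvent.BOUND:
        return Action.APPLY_LEASE
    # From here on the event is TIMEOUT or EXPIRED.
    if not assumed:
        return Action.FAIL_ACTIVATION
    if patched:
        return Action.RETRY_DHCP           # behavior the patch title describes
    return Action.STAY_WITHOUT_IP          # behavior reported in this bug
```

For example, `next_action(DhcpEvent.TIMEOUT, assumed=True, patched=False)` models the reported problem, and switching `patched` to `True` models the proposed fix.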
(In reply to Beniamino Galvani from comment #6)
> Created attachment 1069300 [details]
> [PATCH] device: retry DHCP after timeout/expiration for assumed connections
>
> Maybe we should try harder to keep the DHCP lease when the
> connection is assumed, because otherwise a temporary failure in the
> DHCP server will cause the device to move permanently to ACTIVATED
> state without IP configuration.

LGTM
Applied to master (998ab889495c) and nm-1-0 (2b9db2bf1b9f).
minor bug filed: bug 1265239
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-2315.html