Bug 1246496 - dhclient is terminated and won't start after restart NetworkManager
Summary: dhclient is terminated and won't start after restart NetworkManager
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: NetworkManager
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Beniamino Galvani
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-07-24 13:00 UTC by chocean
Modified: 2015-11-19 11:02 UTC (History)
CC List: 9 users

Fixed In Version: NetworkManager-1.0.6-6.el7
Doc Type: Bug Fix
Doc Text:
When NetworkManager was restarted with an active DHCP connection, a temporary failure of the DHCP server could result in a loss of connectivity. Now NetworkManager tries multiple times to obtain a DHCP address when an active DHCP connection is found upon restart.
Clone Of:
Environment:
Last Closed: 2015-11-19 11:02:30 UTC
Target Upstream Version:


Attachments
network manager debug log (227.64 KB, text/x-vhdl)
2015-08-15 01:02 UTC, chocean
[PATCH] device: retry DHCP after timeout/expiration for assumed connections (2.15 KB, patch)
2015-09-02 08:44 UTC, Beniamino Galvani


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2015:2315 normal SHIPPED_LIVE Moderate: NetworkManager security, bug fix, and enhancement update 2015-11-19 10:06:58 UTC

Description chocean 2015-07-24 13:00:17 UTC
Description of problem:

dhclient is terminated and won't start after restarting NetworkManager when the DHCP server cannot be reached

Version-Release number of selected component (if applicable):
I am actually using CentOS 7

How reproducible:


Steps to Reproduce:
1. Precondition: use DHCP and NetworkManager to configure the IP address
2. systemctl restart NetworkManager
3. Stop the DHCP server

Actual results:

1. dhclient (started by NetworkManager) is terminated
2. dhclient won't start if I run: systemctl restart NetworkManager
3. I cannot get an IP address from the DHCP server after the server is up again


Expected results:
dhclient keeps running and can obtain a new address when the DHCP server is up again

Additional info:

"systemctl restart NetworkManager" is necessary to trigger this problem.

Comment 2 Beniamino Galvani 2015-08-05 12:27:50 UTC
When NetworkManager is restarted, it should try to assume the existing
DHCP connection and should spawn dhclient again, but if the DHCP
server is not available, dhclient will terminate after some time.

How long do you wait to check for the presence of dhclient after
NetworkManager has been restarted? Could you please enable verbose
logging in /etc/NetworkManager/NetworkManager.conf by adding (or
changing) the following:

[logging]
level=DEBUG

then repeat the steps in the bug description and attach the output of
'journalctl -u NetworkManager -b'?
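
The configuration change above could also be scripted. The following is a minimal sketch, assuming the file can be handled by Python's configparser; the function name enable_debug_logging is hypothetical and not part of NetworkManager. It only transforms the text of the file and leaves reading and writing /etc/NetworkManager/NetworkManager.conf to the caller.

```python
import configparser
import io


def enable_debug_logging(conf_text: str) -> str:
    """Return conf_text with [logging] level=DEBUG added or changed.

    Hypothetical helper: it works on the file contents as a string and
    never touches /etc/NetworkManager/NetworkManager.conf itself.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)

    if not parser.has_section("logging"):
        parser.add_section("logging")
    parser.set("logging", "level", "DEBUG")

    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

Note that configparser drops comments when it writes the file back, so editing by hand is preferable if the file carries comments worth keeping.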

Please note that at the moment if the DHCP server can't be contacted
within a given interval (45 seconds), the activation of the connection
fails and DHCP is never retried automatically.

Comment 3 chocean 2015-08-15 01:02:09 UTC
Created attachment 1063193 [details]
network manager debug log

Comment 4 chocean 2015-08-15 01:11:22 UTC
@Beniamino
I have attached the log you requested.

However, I do not think the behavior, "if the DHCP server can't be contacted
within a given interval (45 seconds), the activation of the connection
fails and DHCP is never retried automatically", is reasonable.

Please consider a scenario like this:

1. Servers use DHCP to get their IP addresses
2. The switch for this network breaks by accident
3. The DHCP client releases the IP address after the lease expires
4. The switch is replaced and comes up again
5. All servers have a network connection but none of them has an IP address, because they do not retry the DHCP client.

Comment 5 Beniamino Galvani 2015-08-18 21:15:29 UTC
(In reply to chocean from comment #4)
> @Beniamino
> I have attached the log you requested.

Thanks.

> However, I do not think the behavior, "if the DHCP server can't be contacted
> within a given interval (45 seconds), the activation of the connection
> fails and DHCP is never retried automatically", is reasonable.

Please consider that the described behavior applies only to the case
in which the connection is assumed, i.e. NetworkManager finds the
device already configured with an IP address when starting. This
happens, among other cases, when you restart NetworkManager; since it
doesn't save persistent information about the state of connections, NM
tries to reuse the existing configuration and doesn't take further
initiative if DHCP fails.

The problem, as you noticed, is related to the restart of NM. In my
opinion, you should not restart it unless you have good reasons to,
and in that case you should manually activate the connection with
"nmcli c up CONN-ID".
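
For cases where a restart cannot be avoided, the two commands can be run back to back. This is only a sketch of the workaround described above: the function name restart_and_reactivate and its run parameter are hypothetical, and it assumes NetworkManager is ready to accept nmcli requests as soon as the restart command returns, which may not always hold.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def restart_and_reactivate(
    conn_id: str,
    run: Optional[Callable[[Sequence[str]], object]] = None,
) -> List[List[str]]:
    """Restart NetworkManager, then re-activate the connection conn_id.

    `run` executes one command; by default it calls subprocess.run with
    check=True, so a failing command raises CalledProcessError.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd, check=True)

    commands = [
        ["systemctl", "restart", "NetworkManager"],
        ["nmcli", "c", "up", conn_id],
    ]
    for cmd in commands:
        run(cmd)
    return commands
```

Passing a custom run callable makes it possible to log or dry-run the commands instead of executing them.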

> Please consider a scenario like this:
>
> 1. Servers use DHCP to get their IP addresses
> 2. The switch for this network breaks by accident
> 3. The DHCP client releases the IP address after the lease expires
> 4. The switch is replaced and comes up again
> 5. All servers have a network connection but none of them has an IP address,
> because they do not retry the DHCP client.

Unless the connection is assumed, this should work properly as the
connection would fail but would be retried for a predefined number of
times. And even if all tries fail, the whole procedure is restarted
after 5 minutes.

Comment 6 Beniamino Galvani 2015-09-02 08:44:03 UTC
Created attachment 1069300 [details]
[PATCH] device: retry DHCP after timeout/expiration for assumed connections

Maybe we should try harder to keep the DHCP lease when the
connection is assumed, because otherwise a temporary failure in the
DHCP server will cause the device to move permanently to ACTIVATED
state without IP configuration.
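
The idea of retrying instead of giving up after the first failure can be illustrated as follows. This is a generic sketch, not the code of the attached patch: the function name, the try_dhcp callback, and the values of max_tries and retry_delay are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Optional


def acquire_lease_with_retries(
    try_dhcp: Callable[[], Optional[str]],
    max_tries: int = 3,           # hypothetical value
    retry_delay: float = 120.0,   # hypothetical value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call try_dhcp up to max_tries times, waiting between attempts.

    try_dhcp returns a lease (here just an address string) on success
    or None on timeout. Returns the lease, or None if every try failed.
    """
    for attempt in range(1, max_tries + 1):
        lease = try_dhcp()
        if lease is not None:
            return lease
        if attempt < max_tries:
            sleep(retry_delay)  # wait before the next attempt
    return None
```

With this shape, a DHCP server that is unreachable for one attempt no longer leaves the device without an address, as long as it comes back before the tries run out.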

Comment 7 Thomas Haller 2015-09-08 10:13:41 UTC
(In reply to Beniamino Galvani from comment #6)
> Created attachment 1069300 [details]
> [PATCH] device: retry DHCP after timeout/expiration for assumed connections
> 
> Maybe we should try harder to keep the DHCP lease when the
> connection is assumed, because otherwise a temporary failure in the
> DHCP server will cause the device to move permanently to ACTIVATED
> state without IP configuration.

LGTM

Comment 8 Beniamino Galvani 2015-09-08 11:35:10 UTC
Applied to master (998ab889495c) and nm-1-0 (2b9db2bf1b9f).

Comment 10 Vladimir Benes 2015-09-22 13:02:01 UTC
minor bug filed: bug 1265239

Comment 11 errata-xmlrpc 2015-11-19 11:02:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-2315.html

