Bug 1269093

Summary: dhclient called from NetworkManager doesn't use stable DUID
Product: [Fedora] Fedora
Component: NetworkManager
Version: 23
Status: CLOSED EOL
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Pavel Šimerda (pavlix) <psimerda>
Assignee: Lubomir Rintel <lkundrak>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: cheimes, dcbw, ja, lkundrak, lslebodn, mbasti, psimerda, pspacek, thaller
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-12-20 14:53:27 UTC
Bug Blocks: 883152

Description Pavel Šimerda (pavlix) 2015-10-06 10:08:00 UTC
Description of problem:

When a dynamic ethernet connection is used, dhclient generates
a new DUID at each start. On some systems and networks this results in getting a different IP address each time NetworkManager is started.

Thanks lslebodn and pspacek for input.

Version-Release number of selected component (if applicable):

systemd-222-6.fc23.x86_64
NetworkManager-1.0.6-6.fc23.x86_64
dhcp-client-4.3.3-1.fc23.x86_64

(updated F23 beta)

How reproducible:


Steps to Reproduce:
1. mv /etc/sysconfig/network-scripts/ifcfg-eth0 /root/
2. systemctl restart NetworkManager
3. journalctl --since today | grep duid

Actual results:

A new log message from the starting dhclient, containing "Created duid".


Expected results:

No new log message.


Additional info:

The original motivation for this bug was that this happens in the default installation. That is not the case with my F23 installation, though, because of another bug I'm facing in combination with it. That is why the first step was needed in my case.

In my case a connection for "eth0" is used for "ens3", which thaller confirmed is not the expected behavior. Shall we file a new bug for that or handle both in this one?

Comment 1 Pavel Šimerda (pavlix) 2015-10-06 12:00:14 UTC
Proper DUID management is necessary for IPv4 and IPv6 automatic configuration.

Comment 2 Dr J Austin 2016-01-25 16:24:13 UTC
I can confirm that this causes complete DNS corruption on my system.
This must be a serious problem!
NetworkManager-1.0.10-2.fc23.x86_64
dhcp-client-4.3.3-8.P1.fc23.x86_64
paxos 4.3.3-301.fc23.x86_64

It does NOT occur on a non-updated F23 system
NetworkManager-1.0.6-8.fc23.x86_64
dhcp-client-4.3.3-7.fc23.x86_64
naxos 4.2.8-300.fc23.x86_64

From a "clean" start of dhcp/ddns on a CentOS 6.7 server, the first boot
of an F23 client machine (presumably the first use of NetworkManager)
obtains an IP address and updates the DNS as expected.

Any subsequent "systemctl restart NetworkManager.service" on the client,
reboot of the client,
or "natural" request for lease renewal from the client
obtains a new IP address but does not/cannot update the DNS RRs.

Should DHCPDISCOVER be issued for a lease renewal request?

First boot
Jan 25 15:05:02 maui dhcpd: DHCPDISCOVER from 78:24:af:3a:7e:3a via eth0
Jan 25 15:05:03 maui dhcpd: DHCPOFFER on 148.197.29.129 to 78:24:af:3a:7e:3a (paxos) via eth0
Jan 25 15:05:03 maui named[2188]: client 148.197.29.5#44594: updating zone 'jaa.org.uk/IN': adding an RR at 'paxos.jaa.org.uk' A
Jan 25 15:05:03 maui named[2188]: client 148.197.29.5#44594: updating zone 'jaa.org.uk/IN': adding an RR at 'paxos.jaa.org.uk' TXT
Jan 25 15:05:03 maui dhcpd: Added new forward map from paxos.jaa.org.uk to 148.197.29.129
Jan 25 15:05:03 maui named[2188]: client 148.197.29.5#43414: updating zone '29.197.148.in-addr.arpa/IN': deleting rrset at '129.29.197.148.in-addr.arpa' PTR
Jan 25 15:05:03 maui named[2188]: client 148.197.29.5#43414: updating zone '29.197.148.in-addr.arpa/IN': adding an RR at '129.29.197.148.in-addr.arpa' PTR
Jan 25 15:05:03 maui dhcpd: added reverse map from 129.29.197.148.in-addr.arpa. to paxos.jaa.org.uk
Jan 25 15:05:03 maui dhcpd: DHCPREQUEST for 148.197.29.129 (148.197.29.5) from 78:24:af:3a:7e:3a (paxos) via eth0
Jan 25 15:05:03 maui dhcpd: DHCPACK on 148.197.29.129 to 78:24:af:3a:7e:3a (paxos) via eth0

Subsequent use of "systemctl restart NetworkManager.service", reboot, ...
 
Jan 25 15:42:32 maui dhcpd: DHCPDISCOVER from 78:24:af:3a:7e:3a via eth0
Jan 25 15:42:33 maui dhcpd: DHCPOFFER on 148.197.29.130 to 78:24:af:3a:7e:3a (paxos) via eth0
Jan 25 15:42:33 maui named[2188]: client 148.197.29.5#44704: updating zone 'jaa.org.uk/IN': update unsuccessful: paxos.jaa.org.uk: 'name not in use' prerequisite not satisfied (YXDOMAIN)
Jan 25 15:42:33 maui named[2188]: client 148.197.29.5#42468: updating zone 'jaa.org.uk/IN': update unsuccessful: paxos.jaa.org.uk/TXT: 'RRset exists (value dependent)' prerequisite not satisfied (NXRRSET)
Jan 25 15:42:33 maui dhcpd: Forward map from paxos.jaa.org.uk to 148.197.29.130 FAILED: Has an address record but no DHCID, not mine.
Jan 25 15:42:33 maui dhcpd: DHCPREQUEST for 148.197.29.130 (148.197.29.5) from 78:24:af:3a:7e:3a (paxos) via eth0
Jan 25 15:42:33 maui dhcpd: DHCPACK on 148.197.29.130 to 78:24:af:3a:7e:3a (paxos) via eth0

and again
Jan 25 15:43:33 maui dhcpd: DHCPDISCOVER from 78:24:af:3a:7e:3a via eth0
Jan 25 15:43:34 maui dhcpd: DHCPOFFER on 148.197.29.131 to 78:24:af:3a:7e:3a (paxos) via eth0
Jan 25 15:43:34 maui named[2188]: client 148.197.29.5#44059: updating zone 'jaa.org.uk/IN': update unsuccessful: paxos.jaa.org.uk: 'name not in use' prerequisite not satisfied (YXDOMAIN)
Jan 25 15:43:34 maui named[2188]: client 148.197.29.5#33780: updating zone 'jaa.org.uk/IN': update unsuccessful: paxos.jaa.org.uk/TXT: 'RRset exists (value dependent)' prerequisite not satisfied (NXRRSET)
Jan 25 15:43:34 maui dhcpd: Forward map from paxos.jaa.org.uk to 148.197.29.131 FAILED: Has an address record but no DHCID, not mine.
Jan 25 15:43:34 maui dhcpd: DHCPREQUEST for 148.197.29.131 (148.197.29.5) from 78:24:af:3a:7e:3a (paxos) via eth0
Jan 25 15:43:34 maui dhcpd: DHCPACK on 148.197.29.131 to 78:24:af:3a:7e:3a (paxos) via eth0
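
The pattern in the three excerpts above can be made explicit by pulling out just the DHCPOFFER entries. The following is a minimal sketch, assuming the dhcpd log format is exactly as quoted; the names DHCPD_LOG, OFFER_RE and offers_by_mac are made up for this illustration and are not part of any tool.

```python
import re

# DHCPOFFER lines copied verbatim from the dhcpd log excerpts above.
DHCPD_LOG = """\
Jan 25 15:05:03 maui dhcpd: DHCPOFFER on 148.197.29.129 to 78:24:af:3a:7e:3a (paxos) via eth0
Jan 25 15:42:33 maui dhcpd: DHCPOFFER on 148.197.29.130 to 78:24:af:3a:7e:3a (paxos) via eth0
Jan 25 15:43:34 maui dhcpd: DHCPOFFER on 148.197.29.131 to 78:24:af:3a:7e:3a (paxos) via eth0
"""

# Assumed line shape: "DHCPOFFER on <ip> to <mac> ...", as seen in the excerpts.
OFFER_RE = re.compile(r"DHCPOFFER on (?P<ip>\S+) to (?P<mac>[0-9a-f:]{17})")


def offers_by_mac(log_text):
    """Map each client MAC to the addresses offered to it, in log order."""
    offers = {}
    for line in log_text.splitlines():
        match = OFFER_RE.search(line)
        if match:
            offers.setdefault(match["mac"], []).append(match["ip"])
    return offers


if __name__ == "__main__":
    for mac, ips in offers_by_mac(DHCPD_LOG).items():
        print(mac, "->", ", ".join(ips))
```

The same MAC address is offered a different address on every restart, which is what then trips the DDNS prerequisite checks on the server.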

Investigation shows that dhclient uses a different lease file, with a different value for default-duid, after NetworkManager has been restarted.
First boot
/sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-eno1.pid \
-lf /var/lib/NetworkManager/dhclient-37107b25-12bf-4de0-935e-12b9b6062fb4-eno1.lease \
-cf /var/lib/NetworkManager/dhclient-eno1.conf eno1
...
default-duid "\000\001\000\001\0368\255;x$\257:~:";

Subsequent update, reboot, ...
/sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-eno1.pid \
-lf /var/lib/NetworkManager/dhclient-81ded380-936a-405c-83f2-bfd5b980f69d-eno1.lease \
-cf /var/lib/NetworkManager/dhclient-eno1.conf eno1
...
default-duid "\000\001\000\001\0368\264\335x$\257:~:";
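
To see what actually differs between the two default-duid values, they can be decoded by hand. The sketch below assumes that the lease file writes non-printable bytes as a backslash plus three octal digits (inferred from the two values quoted above) and that the result is a DUID-LLT as laid out in RFC 3315: a 2-byte type, a 2-byte hardware type, a 4-byte time in seconds since 2000-01-01 UTC, then the link-layer address. The function names are made up for this illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# default-duid values copied from the two lease files quoted above.
FIRST_BOOT = r"\000\001\000\001\0368\255;x$\257:~:"
AFTER_RESTART = r"\000\001\000\001\0368\264\335x$\257:~:"

# DUID-LLT counts seconds from this instant (RFC 3315).
DUID_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def unescape(text):
    """Convert the escaped lease-file string into raw bytes.

    Assumption: backslash + three octal digits is one byte, any other
    backslash pair stands for its second character, the rest is ASCII.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        chunk = text[i + 1:i + 4]
        if text[i] == "\\" and re.fullmatch(r"[0-3][0-7]{2}", chunk):
            out.append(int(chunk, 8))
            i += 4
        elif text[i] == "\\" and i + 1 < len(text):
            out.append(ord(text[i + 1]))
            i += 2
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


def parse_duid_llt(raw):
    """Split a DUID-LLT into (duid_type, hw_type, timestamp, link-layer address)."""
    duid_type = int.from_bytes(raw[0:2], "big")
    hw_type = int.from_bytes(raw[2:4], "big")
    seconds = int.from_bytes(raw[4:8], "big")
    mac = ":".join(f"{b:02x}" for b in raw[8:])
    return duid_type, hw_type, DUID_EPOCH + timedelta(seconds=seconds), mac


if __name__ == "__main__":
    for label, value in (("first boot", FIRST_BOOT), ("after restart", AFTER_RESTART)):
        print(label, parse_duid_llt(unescape(value)))
```

Both values decode to type 1 (DUID-LLT) with hardware type 1 and the link-layer address 78:24:af:3a:7e:3a, the same MAC that appears in the dhcpd logs. Only the embedded time field differs, which is consistent with the DUID being generated afresh rather than read back from an earlier lease file.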

Comment 3 Dan Williams 2016-01-25 16:57:38 UTC
The root cause of the issue here is that there is no permanent connection profile for the interface, and therefore all information (including the DHCP lease) is thrown away each time NM restarts.  If there were a permanent connection profile, this would not occur, since the existing lease would be saved and thus available for renewal when NM starts again.

Something like:

nmcli con mod eno1 connection.autoconnect no
nmcli con mod eno1 connection.autoconnect yes

will cause NM to write the connection out to disk as it has been explicitly modified by the user.

The DUID is only used for DHCPv6 and thus isn't relevant to the problem here, but it shouldn't be changing as long as /etc/machine-id or /var/lib/dbus/machine-id is present.
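
The two dhclient command lines in Comment 2 fit this explanation. The sketch below assumes the lease file name follows the pattern dhclient-<UUID>-<interface>.lease and that the UUID identifies the connection profile; the pattern is inferred only from the two paths quoted in Comment 2, and the helper name is made up for this illustration.

```python
import re

# Lease file paths copied from the two dhclient command lines in Comment 2.
LEASE_FIRST_BOOT = (
    "/var/lib/NetworkManager/"
    "dhclient-37107b25-12bf-4de0-935e-12b9b6062fb4-eno1.lease"
)
LEASE_AFTER_RESTART = (
    "/var/lib/NetworkManager/"
    "dhclient-81ded380-936a-405c-83f2-bfd5b980f69d-eno1.lease"
)

# Assumed file name shape: dhclient-<uuid>-<interface>.lease
LEASE_RE = re.compile(
    r"dhclient-(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"-(?P<iface>[^/]+)\.lease$"
)


def split_lease_path(path):
    """Return (uuid, interface) taken from a lease file path."""
    match = LEASE_RE.search(path)
    if match is None:
        raise ValueError(f"unexpected lease file name: {path}")
    return match["uuid"], match["iface"]


if __name__ == "__main__":
    print(split_lease_path(LEASE_FIRST_BOOT))
    print(split_lease_path(LEASE_AFTER_RESTART))
```

The interface is eno1 in both cases but the UUID changes, so under the stated assumption the restarted NM is running under a freshly generated profile and has no reason to look at the earlier lease file.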

Comment 4 Petr Spacek 2016-01-26 07:20:09 UTC
DUID is also used with IPv4. It is an optional field, and when it is present it typically supersedes the MAC address, so this is very much relevant for IPv4 and is actually causing operational problems.
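
One place where a DUID can appear in DHCPv4 is the client identifier (option 61) in the form described by RFC 4361: a type byte of 255, a 4-byte IAID, then the DUID. The sketch below only illustrates that layout. The two DUIDs are the default-duid values from Comment 2 written out as hex, the IAID value is hypothetical, and whether dhclient sends such an identifier in this particular setup is not established in this report.

```python
# The two default-duid values from Comment 2, written out as hex.
DUID_FIRST_BOOT = bytes.fromhex("000100011e38ad3b7824af3a7e3a")
DUID_AFTER_RESTART = bytes.fromhex("000100011e38b4dd7824af3a7e3a")

# Hypothetical IAID, chosen only for illustration.
HYPOTHETICAL_IAID = 1


def rfc4361_client_id(iaid, duid):
    """Build an option 61 payload in the RFC 4361 layout: type 255, IAID, DUID."""
    return bytes([255]) + iaid.to_bytes(4, "big") + duid


if __name__ == "__main__":
    before = rfc4361_client_id(HYPOTHETICAL_IAID, DUID_FIRST_BOOT)
    after = rfc4361_client_id(HYPOTHETICAL_IAID, DUID_AFTER_RESTART)
    print(before.hex())
    print(after.hex())
    print("same client identifier:", before == after)
```

With an identifier of this form, a server that keys leases on the client identifier would see two different clients even though the MAC address embedded in both DUIDs is the same.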

Comment 5 Dr J Austin 2016-01-26 11:27:43 UTC
Many thanks for the prompt feedback.

I have tested out the use of the two commands
nmcli con mod eno1 connection.autoconnect no
nmcli con mod eno1 connection.autoconnect yes
and by magic the client no longer issues DHCPDISCOVER
every time NetworkManager is restarted and hence does not
request and obtain a new IP address.

I also checked what happened when the leases ran out naturally by
setting "default-lease-time  300;" in
dhcpd.conf on the server

Again all was well

I have not yet worked out how often the magic incantation
is required: once after a new installation, once
before the lease runs out, or something else?

Should NetworkManager default to creating a "permanent connection profile"?

Aside:
I suspect that originally the client may not have been
deleting the old IP address from the interface but was adding
the new IP as a "secondary".
At one stage during testing "ip a s" gave multiple entries
of the form
inet 148.197.29.133/24 brd 148.197.29.255 scope global dynamic eno1             


This might explain my original symptoms - after several days the client
and server would both freeze with the network busily flashing away.
The only way out was to press the button on all machines.
I originally thought it was a hardware problem but now suspect
that NetworkManager may have been the cause of the crashes.
At least my machines, router, switches, cables, ...
have all been cleaned and dusted!

Comment 6 Christian Heimes 2016-01-29 15:31:12 UTC
The problem is affecting me, too. My Fedora VMs don't retain their IPv4 addresses.

Comment 7 Dr J Austin 2016-01-30 09:49:32 UTC
Have you by any chance "cloned" your virtual machines?

I now think that maybe my problems were to do with the way I cloned my particular client (hardware) machines.

I used a variation of
Create a valid F23 master machine/SSD and dump it whilst unmounted
...
dump 0f /mnt/zip/F22_dump_2015_06_22 /dev/sdb5
...
Restore to the disk/SSD of another machine
restore -rf /mnt/zip/F22_dump_2015_06_22
(This took less than 120s)
...
mount /dev/sda6 /mnt/zip
mount -o bind /dev  /mnt/zip/dev
mount -o bind /proc /mnt/zip/proc
mount -o bind /sys  /mnt/zip/sys
chroot /mnt/zip
gedit /etc/fstab
gedit /etc/hostname
Sort out grub2

I used to
gedit /etc/sysconfig/network-scripts/ifcfg-eno1
but recently I have been lazy and let
NM "sort it out" - for F23 I think it/I didn't get it quite right!

Adding an extra "mount -o bind /run /mnt/zip/run" to the list above
lets me use nmcli in the chroot environment but it is not quite
clear to me exactly what to do at that stage!

Comment 8 Martin Bašti 2016-02-01 16:47:54 UTC
I was hit by this bug as well.

Comment 9 Fedora End Of Life 2016-11-24 12:42:10 UTC
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '23'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 23 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 10 Fedora End Of Life 2016-12-20 14:53:27 UTC
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.