Bug 1507864 - Network Manager fails to detect duplicate ipv4 IP address [NEEDINFO]
Summary: Network Manager fails to detect duplicate ipv4 IP address
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: NetworkManager   
(Show other bugs)
Version: 7.4
Hardware: All
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Beniamino Galvani
QA Contact: Desktop QE
Ioanna Gkioka
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: 1554861
TreeView+ depends on / blocked
 
Reported: 2017-10-31 10:52 UTC by Sangam
Modified: 2018-10-30 11:12 UTC (History)
12 users (show)

Fixed In Version: NetworkManager-1.12.0-0.1.el7
Doc Type: Bug Fix
Doc Text:
*NetworkManager* no longer fails to detect duplicate IPv4 addresses Previously, *NetworkManager* used to spawn an instance of the `arping` process to detect duplicate IPv4 addresses on the network. As a consequence, if the timeout configured for IPv4 Duplicate Address Detection (DAD) was short and the system was overloaded, *NetworkManager* sometimes failed to detect a duplicate address in time. With this update, the detection of duplicate IPv4 addresses is now performed internally to *NetworkManager* without spawning external binaries, and the described problem no longer occurs.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-30 11:11:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
vbenes: needinfo? (sababu)


Attachments (Terms of Use)
NM log when dad-timeout was set to 2000 (454.32 KB, text/plain)
2017-10-31 16:12 UTC, noah davids
no flags Details
NM log was dad timeout was set to 2001 (335.97 KB, text/plain)
2017-10-31 16:13 UTC, noah davids
no flags Details
Dad-timeout set to 20 and dad working intermittently (237.90 KB, text/plain)
2017-11-02 09:23 UTC, Sangam
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:3207 None None None 2018-10-30 11:12 UTC
Red Hat Bugzilla 1259063 None None None 2019-04-15 06:46 UTC

Description Sangam 2017-10-31 10:52:53 UTC
Description of problem: Network Manager fails to detect duplicate IPV4 IP address 


Version-Release number of selected component (if applicable):


How reproducible: There are 2 systems in the same subnet, One of the system RHEL 7.4(without NM) and the other RHEL 7.4 (NetworkManager-1.8.0-9.el7), assign the same IP address as that of RHEL6 system and bring up the interface. IP gets assigned and will come up without showing any error.


Steps to Reproduce:
1.RHEL 7.4(3.10.0-693.el7.x86_64) system without NetworkManager, other system is RHEL 7.4(3.10.0-693.el7.x86_64) with Network Manager(NetworkManager-1.8.0-9.el7)
2. Assign the Ip address of System without NM to the system with NM and activate the device.
3. Ip address gets assigned and there are no errors seen anywhere in the logs. 

    Have tested by increasing the ipv4.dad-timeout to 5 seconds but that does not change the behavior.

Actual results: Network Manager does not detect the duplicate IPV4 address in the network and assigns the IP to system which causes outages.


Expected results: Network Manager should detect the duplicate IPV4 address.


Additional info:  These were from previous version where the fix was there, with the latest version of NM the issue can be seen.

commit 80e8b9cca931b13273471767a60756524605d3a0
Merge: f607a16 75068a0
Author: Beniamino Galvani <bgalvani@redhat.com>
Date:   Wed Jan 20 11:57:13 2016 +0100

    merge: branch 'bg/ipv4-dad-rh1259063'
    
    Add support for detecting IPv4 duplicate addresses.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1259063

commit 75068a085a2ae4707f875a13ced082be653ef4cb
Author: Beniamino Galvani <bgalvani@redhat.com>
Date:   Wed Dec 23 17:59:08 2015 +0100

    device: detect duplicate IPv4 addresses when method=auto

commit 28f6e8b4d2ae554042027cb7af261289eb07e1e4
Author: Beniamino Galvani <bgalvani@redhat.com>
Date:   Wed Dec 23 14:15:05 2015 +0100

    device: detect duplicate IPv4 addresses when method=manual


Testing
==========
Systems used during testing.

1. 

#cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.4(Maipo)

# uname -a
Linux localhost.localdomain 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux


# rpm -qa NetworkManager
NetworkManager-1.8.0-9.el7.x86_64

# ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1a:4a:00:03:92 brd ff:ff:ff:ff:ff:ff
    inet 10.74.251.126/21 brd 10.74.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::21a:4aff:fe00:392/64 scope link 
       valid_lft forever preferred_lft forever

# systemctl restart network
[root@localhost ~]# systemctl status network
â network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
   Active: active (exited) since Sun 2017-10-29 01:49:28 IST; 1s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 2389 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=0/SUCCESS)

Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain network[2389]: RTNETLINK answers: File exists
Oct 29 01:49:28 localhost.localdomain systemd[1]: Started LSB: Bring up/down networking.


2. 

#cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.4(Maipo)

#uname -a
Linux localhost.localdomain 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

# rpm -qa NetworkManager
NetworkManager-1.8.0-9.el7.x86_64

# ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1a:4a:00:03:90 brd ff:ff:ff:ff:ff:ff
    inet 10.74.251.126/21 brd 10.74.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a7d3:5161:eeed:7d98/64 scope link 
       valid_lft forever preferred_lft forever


# systemctl status NetworkManager
â NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2017-10-28 18:48:22 IST; 2h 40min ago
     Docs: man:NetworkManager(8)
 Main PID: 750 (NetworkManager)
   CGroup: /system.slice/NetworkManager.service
           ââ750 /usr/sbin/NetworkManager --no-daemon

Oct 28 21:28:04 localhost.localdomain NetworkManager[750]: <info>  [1509206284.8747] manager: NetworkManager state is now CONNECTING
Oct 28 21:28:04 localhost.localdomain NetworkManager[750]: <info>  [1509206284.8752] device (eth0): state change: prepare -> config (reas... 50 0]
Oct 28 21:28:04 localhost.localdomain NetworkManager[750]: <info>  [1509206284.9814] device (eth0): state change: config -> ip-config (re... 70 0]
Oct 28 21:28:05 localhost.localdomain NetworkManager[750]: <info>  [1509206285.0187] device (eth0): state change: ip-config -> ip-check (... 80 0]
Oct 28 21:28:05 localhost.localdomain NetworkManager[750]: <info>  [1509206285.0211] device (eth0): state change: ip-check -> secondaries... 90 0]
Oct 28 21:28:05 localhost.localdomain NetworkManager[750]: <info>  [1509206285.0214] device (eth0): state change: secondaries -> activate...100 0]
Oct 28 21:28:05 localhost.localdomain NetworkManager[750]: <info>  [1509206285.0216] manager: NetworkManager state is now CONNECTED_LOCAL
Oct 28 21:28:05 localhost.localdomain NetworkManager[750]: <info>  [1509206285.0625] manager: NetworkManager state is now CONNECTED_GLOBAL
Oct 28 21:28:05 localhost.localdomain NetworkManager[750]: <info>  [1509206285.0627] policy: set 'eth0' (eth0) as default for IPv4 routing and DNS
Oct 28 21:28:05 localhost.localdomain NetworkManager[750]: <info>  [1509206285.0630] device (eth0): Activation: successful, device activated.
Hint: Some lines were ellipsized, use -l to show in full.

Comment 2 Beniamino Galvani 2017-10-31 14:42:52 UTC
(In reply to Sangam from comment #0) 
> How reproducible: There are 2 systems in the same subnet, One of the system
> RHEL 7.4(without NM) and the other RHEL 7.4 (NetworkManager-1.8.0-9.el7),
> assign the same IP address as that of RHEL6 system and bring up the
> interface. IP gets assigned and will come up without showing any error.
> 
> 
> Steps to Reproduce:
> 1.RHEL 7.4(3.10.0-693.el7.x86_64) system without NetworkManager, other
> system is RHEL 7.4(3.10.0-693.el7.x86_64) with Network
> Manager(NetworkManager-1.8.0-9.el7)
> 2. Assign the Ip address of System without NM to the system with NM and
> activate the device.
> 3. Ip address gets assigned and there are no errors seen anywhere in the
> logs. 
> 
>     Have tested by increasing the ipv4.dad-timeout to 5 seconds but that
> does not change the behavior.

Note that the value is in milliseconds... does the following work?

 nmcli connection modify <con-name> ipv4.dad-timeout 2000
 nmcli connection up <con-name>

> Actual results: Network Manager does not detect the duplicate IPV4 address
> in the network and assigns the IP to system which causes outages.

Can you please run the following and attach the log.txt file?

 nmcli general logging level TRACE
 nmcli connection modify <con-name> ipv4.dad-timeout 2000
 nmcli connection up <con-name>
 # verify that the connection succeeds, then
 journalctl -u NetworkManager --since "3min ago" > log.txt

Thanks!

Comment 3 noah davids 2017-10-31 14:47:21 UTC
Note that in my tests dad.timeout had to be at least 2001 for DAD to work

Comment 4 Beniamino Galvani 2017-10-31 15:09:30 UTC
(In reply to noah davids from comment #3)
> Note that in my tests dad.timeout had to be at least 2001 for DAD to work

Can you attach journal logs with dad-timeout=2000 ?

Comment 5 noah davids 2017-10-31 16:09:27 UTC
========================================================
========================================================
        dad-timeout set to 2000
========================================================
========================================================

# nmcli con show ens3 | grep -E "dad-timeout|may-fail"
ipv4.may-fail:                          yes
ipv4.dad-timeout:                       2000
ipv6.may-fail:                          yes
[root@rhel-7-4 ~]# reboot
Connection to 192.168.1.23 closed by remote host.
Connection to 192.168.1.23 closed.
[ndavids@ndavids 1962063]$ ssh root@192.168.1.23
root@192.168.1.23's password: 
Last login: Tue Oct 31 08:48:27 2017 from 192.168.1.9
[root@rhel-7-4 ~]# ip a l ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:8a:8b:6d brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.200/24 brd 192.168.1.255 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 2600:8800:78e:2000::8/128 scope global dynamic 
       valid_lft 3492sec preferred_lft 1692sec
    inet6 2600:8800:78e:2000:5054:ff:fe8a:8b6d/64 scope global noprefixroute dynamic 
       valid_lft 86392sec preferred_lft 86392sec
    inet6 fe80::5054:ff:fe8a:8b6d/64 scope link 
       valid_lft forever preferred_lft forever
[root@rhel-7-4 ~]# nmcli con show ens3 | grep -E "dad-timeout|may-fail"
ipv4.may-fail:                          yes
ipv4.dad-timeout:                       2000
ipv6.may-fail:                          yes
[root@rhel-7-4 ~]# journalctl -u NetworkManager --since "3min ago" > dad-2000-log.txt
# cat dad-2000-log.txt | grep -i dad
Oct 31 08:50:26 rhel-7-4 NetworkManager[784]: <debug> [1509465026.3309] ++ ipv4dad-timeout          = 2000
Oct 31 08:50:28 rhel-7-4 NetworkManager[784]: <debug> [1509465028.7753] arping[0x5582c86c9b80,2]: DAD timed out for 192.168.1.200
Oct 31 08:50:28 rhel-7-4 NetworkManager[784]: <debug> [1509465028.7754] device[0x5582c8696180] (ens3): IPv4 DAD result: address 192.168.1.200 is unique
Oct 31 08:50:29 rhel-7-4 NetworkManager[784]: <debug> [1509465029.9590] device[0x5582c8696180] (ens3): IPv6 DAD: awaiting termination
Oct 31 08:50:31 rhel-7-4 NetworkManager[784]: <debug> [1509465031.6448] device[0x5582c8696180] (ens3): IPv6 DAD terminated
[root@rhel-7-4 ~]# 

$ tshark -r network-manager-dad-timeout-2000.pcapng 2>/dev/null
  1 0.000000000 0.000000000 RealtekU_8a:8b:6d   Broadcast    60  Who has 192.168.1.200? Tell 0.0.0.0
  2 0.000122445 0.000122445 Netgear_bc:cb:c9   RealtekU_8a:8b:6d 60  192.168.1.200 is at 00:09:5b:bc:cb:c9
  3 0.006404122 0.006281677 RealtekU_8a:8b:6d   Broadcast    60  Gratuitous ARP for 192.168.1.200 (Reply) (duplicate use of 192.168.1.200 detected!)
  4 2.084537306 2.078133184 RealtekU_8a:8b:6d   Broadcast    60  Gratuitous ARP for 192.168.1.200 (Request) (duplicate use of 192.168.1.200 detected!)
  5 3.200946555 1.116409249 RealtekU_8a:8b:6d   Broadcast    60  Who has 192.168.1.48? Tell 192.168.1.200 (duplicate use of 192.168.1.200 detected!)
  6 3.201130196 0.000183641 SegateTe_4f:b3:6e   RealtekU_8a:8b:6d 60  192.168.1.48 is at 00:10:75:4f:b3:6e (duplicate use of 192.168.1.200 detected!)
  7 3.397276376 0.196146180 RealtekU_8a:8b:6d   Netgear_89:47:78 60  192.168.1.22 is at 52:54:00:8a:8b:6d
$


========================================================
========================================================
        dad-timeout set to 2001
========================================================
========================================================

# nmcli con mod ens3 ipv4.dad-timeout 2001
[root@rhel-7-4 ~]# nmcli con show ens3 | grep -E "dad-timeout|may-fail"
ipv4.may-fail:                          yes
ipv4.dad-timeout:                       2001
ipv6.may-fail:                          yes
[root@rhel-7-4 ~]# reboot
Connection to 192.168.1.23 closed by remote host.
Connection to 192.168.1.23 closed.
[ndavids@ndavids 1962063]$ ssh root@192.168.1.23
root@192.168.1.23's password: 
Last login: Tue Oct 31 08:52:09 2017 from 192.168.1.9
[root@rhel-7-4 ~]# ip a l ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:8a:8b:6d brd ff:ff:ff:ff:ff:ff
[root@rhel-7-4 ~]#  nmcli con show ens3 | grep -E "dad-timeout|may-fail"
ipv4.may-fail:                          yes
ipv4.dad-timeout:                       3000
ipv6.may-fail:                          yes
[root@rhel-7-4 ~]# journalctl -u NetworkManager --since "3min ago" > dad-2001-log.txt
[root@rhel-7-4 ~]# cat dad-2001-log.txt | grep -i dad
Oct 31 09:00:37 rhel-7-4 NetworkManager[789]: <debug> [1509465637.2668] ++ ipv4dad-timeout          = 3000
Oct 31 09:00:39 rhel-7-4 NetworkManager[789]: <warn>  [1509465639.6829] device (ens3): IPv4 DAD result: address 192.168.1.200 is duplicate
Oct 31 09:00:39 rhel-7-4 NetworkManager[789]: <warn>  [1509465639.7249] device (ens3): IPv4 DAD result: address 192.168.1.200 is duplicate
Oct 31 09:00:39 rhel-7-4 NetworkManager[789]: <warn>  [1509465639.7574] device (ens3): IPv4 DAD result: address 192.168.1.200 is duplicate
Oct 31 09:00:39 rhel-7-4 NetworkManager[789]: <warn>  [1509465639.7888] device (ens3): IPv4 DAD result: address 192.168.1.200 is duplicate
Oct 31 09:00:40 rhel-7-4 NetworkManager[789]: <warn>  [1509465640.0859] device (ens3): IPv4 DAD result: address 192.168.1.200 is duplicate
[root@rhel-7-4 ~]# 

$ tshark -r network-manager-dad-timeout-2001.pcapng 2>/dev/null
  1 0.000000000 0.000000000 RealtekU_8a:8b:6d   Broadcast    60  Who has 192.168.1.200? Tell 0.0.0.0
  2 0.000117706 0.000117706 Netgear_bc:cb:c9   RealtekU_8a:8b:6d 60  192.168.1.200 is at 00:09:5b:bc:cb:c9
  3 0.041923440 0.041805734 RealtekU_8a:8b:6d   Broadcast    60  Who has 192.168.1.200? Tell 0.0.0.0
  4 0.042046271 0.000122831 Netgear_bc:cb:c9   RealtekU_8a:8b:6d 60  192.168.1.200 is at 00:09:5b:bc:cb:c9
  5 0.078561367 0.036515096 RealtekU_8a:8b:6d   Broadcast    60  Who has 192.168.1.200? Tell 0.0.0.0
  6 0.078697410 0.000136043 Netgear_bc:cb:c9   RealtekU_8a:8b:6d 60  192.168.1.200 is at 00:09:5b:bc:cb:c9
  7 0.107741160 0.029043750 RealtekU_8a:8b:6d   Broadcast    60  Who has 192.168.1.200? Tell 0.0.0.0
  8 0.107874361 0.000133201 Netgear_bc:cb:c9   RealtekU_8a:8b:6d 60  192.168.1.200 is at 00:09:5b:bc:cb:c9
  9 0.405489357 0.297614996 RealtekU_8a:8b:6d   Broadcast    60  Who has 192.168.1.200? Tell 0.0.0.0
 10 0.405627387 0.000138030 Netgear_bc:cb:c9   RealtekU_8a:8b:6d 60  192.168.1.200 is at 00:09:5b:bc:cb:c9
$

Comment 6 noah davids 2017-10-31 16:12 UTC
Created attachment 1345999 [details]
NM log when dad-timeout was set to 2000

Comment 7 noah davids 2017-10-31 16:13 UTC
Created attachment 1346000 [details]
NM log was dad timeout was set to 2001

Comment 9 Sangam 2017-11-02 09:23 UTC
Created attachment 1346923 [details]
Dad-timeout set to 20 and dad working intermittently

Comment 10 Beniamino Galvani 2017-12-06 09:48:49 UTC
(In reply to Sangam from comment #9)
> Created attachment 1346923 [details]
> Dad-timeout set to 20 and dad working intermittently

20 milliseconds is a very low value, please increase it to at least 1000.

Comment 11 Beniamino Galvani 2017-12-06 09:50:13 UTC
When we start arping to detect duplicate addresses, we set a timer to
the timeout configured in the connection. If the timer hits before
arping returns we consider the address unique. The problem is that
there is no guarantee that in such interval arping was actually
started and sent out any requests, because of external factors as
slow disk I/O or overloaded system.

Fixing this is not trivial and requires a rework of the duplicate
address detection code to not use arping. For example we could reuse
n-acd [1] code (by importing it in our tree until it is available as a
standalone library). This rework requires some effort and I don't
think it can make it to 7.5.

[1] https://github.com/nettools/n-acd

Comment 12 Beniamino Galvani 2018-04-01 13:51:59 UTC
Please review branch bg/n-acd-rh1507864.

Comment 13 Thomas Haller 2018-04-04 12:01:21 UTC
+src_libNetworkManager_la_CPPFLAGS = \
+    $(src_cppflags) \
+    -I$(srcdir)/lib \
+    -I$(builddir)/lib

/lib, just like /shared seems something that should be available to every component. Especially, since one still needs to specify the directory like #include "nacd/...", I would add /lib as search path for everything (like shared).

Comment 14 Thomas Haller 2018-04-04 18:39:50 UTC
+         hwaddr_str = nm_utils_hwaddr_ntoa (event->defended.sender,
+                                            event->defended.n_sender);
+         _LOGD ("defended address %s from host %s", address_str, hwaddr_str);


could we only construct hwaddr_str if it's actually needed? The logging macros evaluate their arguments only if they are logging anything.


> libnm-core: fix documentation for dad-timeout property

part of this commit seem to belong to e previous commit.



>     The file is still called nm-arping-manager for lazyness, even if a
>     better name would be nm-acd-manager.

maybe let's do a follow-up commit that fixes the name?


nm_arping_manager_new()
»···priv->hwaddr = hwaddr;
»···priv->hwaddr_len = hwaddr_len;

You must clone the pointer (or otherwise ensure that the lifetime of the pointer is suitable. n_acd_start() as it is at the moment, requires that hwaddr_len is ETH_ALEN. Hence, you can do:

typedef struct {
    ...
»···const guint8 hwaddr[ETH_ALEN];
    ...
} NMArpingManagerPrivate;

and

nm_arping_manager_new():
»···g_return_val_if_fail (hwaddr, NULL);
»···g_return_val_if_fail (hwaddr_len == ETH_ALEN, NULL);

»···self = g_object_new (NM_TYPE_ARPING_MANAGER, NULL);
»···priv = NM_ARPING_MANAGER_GET_PRIVATE (self);
»···priv->ifindex = ifindex;
»···memcpy (&priv->hwaddr, hwaddr, hwaddr_len);

(I often like nm_assert(), but in this case, g_return_val_if_fail(), because the caller that passes invalid arguments is so "far" away. IMO this should also be checked in production code).



I wonder if it makes sense that NMArpingManager is a GObject. Wouldn't it be simpler, to have it just a plain struct, like https://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/src/nm-checkpoint-manager.c?id=c4048d4d90fda904e4ae8e68e35cb547c53cfce3#n324 , and just give it a callback-function + user-data in nm_arping_manager_new()?

Comment 15 Beniamino Galvani 2018-04-05 14:08:52 UTC
(In reply to Thomas Haller from comment #14)
> could we only construct hwaddr_str if it's actually needed? The logging
> macros evaluate their arguments only if they are logging anything.

> part of this commit seem to belong to e previous commit.

> maybe let's do a follow-up commit that fixes the name?

> You must clone the pointer (or otherwise ensure that the lifetime of the
> pointer is suitable. n_acd_start() as it is at the moment, requires that
> hwaddr_len is ETH_ALEN. Hence, you can do:

All fixed.

> I wonder if it makes sense that NMArpingManager is a GObject. Wouldn't it be
> simpler, to have it just a plain struct, like
> https://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/src/nm-
> checkpoint-manager.c?id=c4048d4d90fda904e4ae8e68e35cb547c53cfce3#n324 , and
> just give it a callback-function + user-data in nm_arping_manager_new()?

I don't find it's too much more complicated to have a GObject and in this way we can use signals for the probe terminated event instead of a callback.

Comment 16 Beniamino Galvani 2018-04-05 14:25:03 UTC
(In reply to Thomas Haller from comment #13)
> +src_libNetworkManager_la_CPPFLAGS = \
> +    $(src_cppflags) \
> +    -I$(srcdir)/lib \
> +    -I$(builddir)/lib
> 
> /lib, just like /shared seems something that should be available to every
> component. Especially, since one still needs to specify the directory like
> #include "nacd/...", I would add /lib as search path for everything (like
> shared).

/lib is not like shared, which contains general purpose utilities, it's for external libraries that we import as sources; therefore only target that need it should require it explicitly in my opinion.

Also, I think that adding /lib to the search path of every target in the makefile, even if it is currently needed by one, only adds noise.

Comment 17 Thomas Haller 2018-04-06 10:43:49 UTC
acd_event():
+    nm_utils_inet4_ntop (info->address, address_str);

this is only used for logging. Let's move it inside the logging macro.


Maybe DAD_TIMEOUT is not a good name, because it doesn't reflect that this is IPv4 only. IP4_DAD_TIMEOUT? Or maybe: the term DAD seems more related to IPv6, for IPv4 this is called ACD? How about ACD_TIMEOUT?


Finally, all of this is good, except, in my opinion, I would move the content of lib/ directory to shared/. These two directories have essentially the same purpose (as I understand it). They are internal code (linked statically), that is independent of the application that ends up using it. True, n-acd is only used by the core part in src/, but it is not tied to anything src/ specific.


Rest lgtm

Comment 18 Beniamino Galvani 2018-04-06 16:11:00 UTC
(In reply to Thomas Haller from comment #17)
> acd_event():
> +    nm_utils_inet4_ntop (info->address, address_str);
> 
> this is only used for logging. Let's move it inside the logging macro.

Ok.

> 
> 
> Maybe DAD_TIMEOUT is not a good name, because it doesn't reflect that this
> is IPv4 only. IP4_DAD_TIMEOUT? Or maybe: the term DAD seems more related to
> IPv6, for IPv4 this is called ACD? How about ACD_TIMEOUT?

Sounds good.

> Finally, all of this is good, except, in my opinion, I would move the
> content of lib/ directory to shared/. These two directories have essentially
> the same purpose (as I understand it). They are internal code (linked
> statically), that is independent of the application that ends up using it.
> True, n-acd is only used by the core part in src/, but it is not tied to
> anything src/ specific.

I moved the imported subprojects to shared/ now and dropped the existing c-list.h header. We could also replace the siphash24 code with c-siphash (to do).

Comment 19 Thomas Haller 2018-04-06 16:26:11 UTC
(In reply to Beniamino Galvani from comment #18)
> (In reply to Thomas Haller from comment #17)
> > Maybe DAD_TIMEOUT is not a good name, because it doesn't reflect that this
> > is IPv4 only. IP4_DAD_TIMEOUT? Or maybe: the term DAD seems more related to
> > IPv6, for IPv4 this is called ACD? How about ACD_TIMEOUT?
> 
> Sounds good.

The commit message still references the old name.


> build: autotools: link NM against n-acd

Most imported shared files are not dist-ed. That's probably correct. Can you once test that `make distcheck` passes? Just issue `./contrib/fedora/rpm/build_clean.sh` without arguments. I think c-list.h is wrongly not disted, didn't check in detail.


> core: rename 'arping' to 'acd'

Let's not drop the old file entirely from .gitignore. Move it to the graveyard section at the bottom of the file. It's useful when switching branches.

Comment 20 Thomas Haller 2018-04-06 16:55:54 UTC
> We could also replace the siphash24 code with c-siphash (to do).

yes, we could and probably should. However, systemd code still uses their version of siphash24, so, we anyway have both versions (unfortunately) -- unless, maybe we could move shared/nm-utils/siphash24.h to src/systemd/sd-adapt/, and use macros to re-define everything based on c-siphash.

Comment 21 Beniamino Galvani 2018-04-08 09:26:30 UTC
(In reply to Thomas Haller from comment #19)

> The commit message still references the old name.

Ops, fixed.

> > build: autotools: link NM against n-acd
> 
> Most imported shared files are not dist-ed. That's probably correct. Can you
> once test that `make distcheck` passes? Just issue
> `./contrib/fedora/rpm/build_clean.sh` without arguments. I think c-list.h is
> wrongly not disted, didn't check in detail.

It is disted:

  EXTRA_DIST += shared/c-list/src/c-list.h

and 'make distcheck' passes because all other source files are included in a _SOURCE variable.

> > core: rename 'arping' to 'acd'
> 
> Let's not drop the old file entirely from .gitignore. Move it to the
> graveyard section at the bottom of the file. It's useful when switching
> branches.

Done.

Comment 22 Thomas Haller 2018-04-08 15:52:33 UTC
lgtm now

Comment 30 Vladimir Benes 2018-08-22 08:28:08 UTC
DAD works well in normal situation and should be fixed above mentioned situations. Would it be possible to retest with the newest versions?

Comment 32 errata-xmlrpc 2018-10-30 11:11:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3207


Note You need to log in before you can comment on or make changes to this bug.