Bug 1465161 - Seeing ipv6 duplicate address, causing network issues
Status: MODIFIED
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 10.0 (Newton)
Hardware: Unspecified  OS: Linux
Priority: high  Severity: high
Target Milestone: ---
Target Release: 10.0 (Newton)
Assigned To: Daniel Alvarez Sanchez
QA Contact: Toni Freger
Keywords: Triaged, ZStream
Depends On:
Blocks:
Reported: 2017-06-26 16:37 EDT by rlopez
Modified: 2017-09-19 03:29 EDT (History)
CC: 6 users

See Also:
Fixed In Version: openstack-neutron-9.3.1-9.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description rlopez 2017-06-26 16:37:07 EDT
I have an OSP10 environment running on the latest bits: 3 controllers, 3 computes, and 3 existing Ceph storage nodes.

Currently working on creating OCP 3.4 heat templates for OSP10. When running the heat templates, /var/log/messages reports the following:


Jun 26 19:21:54 overcloud-controller-0 kernel: IPv6: qg-d5aa7c20-41: IPv6 duplicate address 2620:52:0:1372:f816:3eff:fe97:560 detected!

When this happens there is usually a freeze in accessing the instances, and at times it has caused our heat stack that installs OCP to fail. The OCP stack does not use IPv6, and neither does my OSP environment (to my knowledge), so I'm not sure why I'm seeing this message.


Info from controller:

# cat /etc/sysctl.conf | grep ipv6
net.ipv6.conf.default.autoconf=0
net.ipv6.conf.default.accept_ra=0
net.ipv6.conf.all.autoconf=0
net.ipv6.conf.all.accept_ra=0
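[Editor's note: net.* sysctls are scoped per network namespace, so settings in the host's /etc/sysctl.conf do not propagate into the qrouter namespaces, which is relevant to the behavior reported below. A quick sketch to compare the host value with the value inside the router namespace shown in this report (run as root):]

```shell
# Host-side value:
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
# Value inside the router namespace -- can differ from the host's:
ip netns exec qrouter-43fc42e1-f6ee-40bf-9657-1f463ab5f901 \
    cat /proc/sys/net/ipv6/conf/all/disable_ipv6
```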



# ip netns exec qrouter-43fc42e1-f6ee-40bf-9657-1f463ab5f901 ip a


1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
58: ha-5b4d3b49-e2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:b6:c4:a1 brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.9/18 brd 169.254.255.255 scope global ha-5b4d3b49-e2
       valid_lft forever preferred_lft forever
    inet 169.254.0.2/24 scope global ha-5b4d3b49-e2
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:feb6:c4a1/64 scope link 
       valid_lft forever preferred_lft forever
59: qg-d5aa7c20-41: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1496 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:97:05:60 brd ff:ff:ff:ff:ff:ff
    inet 10.19.114.187/23 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.198/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.199/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.200/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.207/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.211/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.213/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe97:560/64 scope link nodad 
       valid_lft forever preferred_lft forever
60: qr-e0eb81e9-5d: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:bc:41:15 brd ff:ff:ff:ff:ff:ff
    inet 172.22.10.1/24 scope global qr-e0eb81e9-5d
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:febc:4115/64 scope link nodad 
       valid_lft forever preferred_lft forever


From the director node, using overcloudrc, I can see the router with that ID.


(just a snippet)
[stack@osp10-pit-director ~]$ neutron router-list
+--------------------------------------+-----------------------------------+---------------------------------------------------------------+-------------+------+
| id                                   | name                              | external_gateway_info                                         | distributed | ha   |
+--------------------------------------+-----------------------------------+---------------------------------------------------------------+-------------+------+
| 43fc42e1-f6ee-40bf-9657-1f463ab5f901 | test-external_router-w2puzwdzxmwp | {"network_id": "084884f9-d9d2-477a-bae7-26dbb4ff1873",        | False       | True |
|                                      |                                   | "enable_snat": true, "external_fixed_ips": [{"subnet_id":     |             |      |
|                                      |                                   | "732844a3-7196-4fac-a75a-cdfca872462e", "ip_address":         |             |      |
|                                      |                                   | "10.19.114.187"}]} 



From controller:

# ovs-vsctl show
5150acbd-78b0-4daa-868d-71e44f6dd898
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port "ha-5b4d3b49-e2"
            tag: 19
            Interface "ha-5b4d3b49-e2"
                type: internal
        Port "ha-3767b6fe-e0"
            tag: 1
            Interface "ha-3767b6fe-e0"
                type: internal
        Port "tap7e11f374-25"
            tag: 4
            Interface "tap7e11f374-25"
                type: internal
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port "qr-607a1668-c9"
            tag: 7
            Interface "qr-607a1668-c9"
                type: internal
        Port "ha-90061261-bf"
            tag: 9
            Interface "ha-90061261-bf"
                type: internal
        Port "ha-5801926f-08"
            tag: 3
            Interface "ha-5801926f-08"
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port "qr-2dff3443-6a"
            tag: 2
            Interface "qr-2dff3443-6a"
                type: internal
        Port "qr-72a5264b-47"
            tag: 20
            Interface "qr-72a5264b-47"
                type: internal
        Port "tap1964f1af-24"
            tag: 2
            Interface "tap1964f1af-24"
                type: internal
        Port "qr-d468c597-ab"
            tag: 8
            Interface "qr-d468c597-ab"
                type: internal
        Port "qr-cbde6da8-d4"
            tag: 6
            Interface "qr-cbde6da8-d4"
                type: internal
        Port "qr-e0eb81e9-5d"
            tag: 21
            Interface "qr-e0eb81e9-5d"
                type: internal
        Port "qr-e6cd86b9-8d"
            tag: 10
            Interface "qr-e6cd86b9-8d"
                type: internal
        Port "qr-f21200a3-1f"
            tag: 4
            Interface "qr-f21200a3-1f"
                type: internal
        Port "ha-98fb8c11-81"
            tag: 19
            Interface "ha-98fb8c11-81"
                type: internal
        Port "tap1c3e217e-16"
            tag: 6
            Interface "tap1c3e217e-16"
                type: internal
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "vxlan-ac100416"
            Interface "vxlan-ac100416"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.16.4.16", out_key=flow, remote_ip="172.16.4.22"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "vxlan-ac100411"
            Interface "vxlan-ac100411"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.16.4.16", out_key=flow, remote_ip="172.16.4.17"}
        Port "vxlan-ac10040f"
            Interface "vxlan-ac10040f"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.16.4.16", out_key=flow, remote_ip="172.16.4.15"}
        Port "vxlan-ac100414"
            Interface "vxlan-ac100414"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.16.4.16", out_key=flow, remote_ip="172.16.4.20"}
        Port "vxlan-ac10040b"
            Interface "vxlan-ac10040b"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.16.4.16", out_key=flow, remote_ip="172.16.4.11"}
        Port br-tun
            Interface br-tun
                type: internal
    Bridge br-tenant
        fail_mode: standalone
        Port "p3p1"
            Interface "p3p1"
        Port br-tenant
            Interface br-tenant
                type: internal
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "qg-5af7d801-3c"
            Interface "qg-5af7d801-3c"
                type: internal
        Port "qg-a2db91ba-78"
            Interface "qg-a2db91ba-78"
                type: internal
        Port "qg-d5aa7c20-41"
            Interface "qg-d5aa7c20-41"
                type: internal
        Port "qg-d13b53fc-b3"
            Interface "qg-d13b53fc-b3"
                type: internal
        Port "qg-2b45ac4d-a1"
            Interface "qg-2b45ac4d-a1"
                type: internal
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
        Port br-ex
            Interface br-ex
                type: internal
        Port "em3"
            Interface "em3"
    ovs_version: "2.5.0"


Let me know if you need anything else; this is blocking us from shipping the OCP 3.4 heat template RPM.
Comment 1 rlopez 2017-06-27 10:33:55 EDT
Not sure if I'm running into something like: https://bugs.launchpad.net/neutron/+bug/1459856


Also found this (older): https://bugs.launchpad.net/nova/+bug/1011134/comments/2

FYI: I've gone into every OSP instance belonging to this stack, disabled IPv6 in sysctl.conf as follows, and rebooted.

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

I STILL see the dups :-/
Comment 2 rlopez 2017-06-27 11:13:24 EDT
The duplicate address and the interface it is complaining about:

Jun 27 14:33:56 overcloud-controller-0 kernel: IPv6: qg-d5aa7c20-41: IPv6 duplicate address 2620:52:0:1372:f816:3eff:fe97:560 detected!

Looking at the netns of the router that owns qg-d5aa7c20-41:

59: qg-d5aa7c20-41: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1496 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:97:05:60 brd ff:ff:ff:ff:ff:ff
    inet 10.19.114.187/23 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.198/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.199/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.200/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.207/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.211/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.213/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet 10.19.114.188/32 scope global qg-d5aa7c20-41
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe97:560/64 scope link nodad 
       valid_lft forever preferred_lft forever
60: qr-e0eb81e9-5d: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:bc:41:15 brd ff:ff:ff:ff:ff:ff
    inet 172.22.10.1/24 scope global qr-e0eb81e9-5d
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:febc:4115/64 scope link nodad 
       valid_lft forever preferred_lft forever


The only instance involved is one labeled test-devs, which has IPv6 disabled on the system itself. For some reason the inet6 address exists on the qg interface, yet within the instance itself there are no inet6 addresses:

[cloud-user@test-devs ~]$ sudo -i
[root@test-devs ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:5c:e5:5e brd ff:ff:ff:ff:ff:ff
    inet 172.22.10.10/24 brd 172.22.10.255 scope global dynamic eth0
       valid_lft 86288sec preferred_lft 86288sec

[root@test-devs ~]# cat /etc/sysctl.conf 
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
[root@test-devs ~]# cat /proc/sys/net/ipv6/conf/default/disable_ipv6
1

When I reboot this system, or shut it off, the duplicate-address messages go quiet. As soon as it's back up, the dup messages return.
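[Editor's note: as a cross-check, the flagged address 2620:52:0:1372:f816:3eff:fe97:560 carries exactly the SLAAC (Modified EUI-64) interface identifier derived from qg-d5aa7c20-41's MAC fa:16:3e:97:05:60, which suggests the address was auto-configured from the router port's MAC rather than originating inside any instance. A small sketch of the EUI-64 derivation:]

```shell
# Derive the SLAAC EUI-64 interface ID from a MAC address:
# flip the universal/local bit of the first octet, then insert ff:fe
# between the OUI and the device half of the MAC.
mac="fa:16:3e:97:05:60"        # MAC of qg-d5aa7c20-41 from `ip a` above
set -- $(echo "$mac" | tr ':' ' ')
first=$(printf '%02x' $(( 0x$1 ^ 0x02 )))
printf '%x:%x:%x:%x\n' "0x$first$2" "0x${3}ff" "0xfe$4" "0x$5$6"
# -> f816:3eff:fe97:560
```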
Comment 3 rlopez 2017-06-27 11:28:39 EDT
Disabling ipv6 via the netns removes the duplicate:

 ip netns exec qrouter-43fc42e1-f6ee-40bf-9657-1f463ab5f901 sysctl -w net.ipv6.conf.qr-e0eb81e9-5d.disable_ipv6=1

I went ahead and did it for all interfaces:

 ip netns exec qrouter-43fc42e1-f6ee-40bf-9657-1f463ab5f901 sysctl -w net.ipv6.conf.all.disable_ipv6=1

However, the above seems like a total hack. Why does an inet6 address get assigned if it's not being used to begin with?
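[Editor's note: a sketch generalizing the one-off command above to every router namespace on a controller, assuming the namespaces follow the qrouter-&lt;router-id&gt; naming shown in this report (run as root):]

```shell
# Apply the same disable_ipv6 workaround inside every qrouter namespace.
for ns in $(ip netns list | awk '{print $1}' | grep '^qrouter-'); do
    ip netns exec "$ns" sysctl -w net.ipv6.conf.all.disable_ipv6=1
done
```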
Comment 4 rlopez 2017-06-27 11:34:07 EDT
More info that might be useful: https://bugs.launchpad.net/mos/+bug/1596846
Comment 6 rlopez 2017-07-11 10:35:21 EDT
Can I get a response please?
Comment 7 Assaf Muller 2017-08-07 09:41:01 EDT
Assigned to Daniel for triage.
Comment 8 Daniel Alvarez Sanchez 2017-08-07 11:10:01 EDT
Could this be a duplicate of [0]?
If so, it would be fixed in openstack-neutron-9.3.1-9.el7ost

We could confirm by capturing traffic on the controllers. However, if it's easy to reproduce, I would try that version, since this looks very similar; disabling IPv6 forwarding on the backup instance could possibly solve it. Or, at the least:

ip netns exec qrouter-43fc42e1-f6ee-40bf-9657-1f463ab5f901 sysctl -w net.ipv6.conf.all.forwarding=0

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1426735
Comment 9 rlopez 2017-08-18 11:54:28 EDT
Hi Daniel,

Thanks for your reply. 

Question:

Can I easily implement this by just upgrading my controllers with that specific RPM package and restarting neutron?

Would I always require having to disable forwarding for every router that is created?

Any idea why ipv4 and ipv6 are not happy with each other when both enabled? Especially since I'm not even using the ipv6...
Comment 10 Daniel Alvarez Sanchez 2017-08-20 13:39:48 EDT
(In reply to rlopez from comment #9)
> Hi Daniel,
> 
> Thanks for your reply. 
> 
> Question:
> 
> Can I easily implement this by just upgrading my controllers with that
> specific RPM package and restarting neutron?

Yes, that should be fine.
> 
> Would I always require having to disable forwarding for every router that is
> created?

The RPM package includes the patch that does it automatically every time
a failover occurs.

> 
> Any idea why ipv4 and ipv6 are not happy with each other when both enabled?
> Especially since I'm not even using the ipv6...

In this case, since the interface has IPv6 forwarding enabled, it automatically
subscribes to several multicast groups. When multicast traffic is received,
the backup node responds, and the ToR switch learns the backup node's MAC address on its port, disrupting traffic to the master.
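[Editor's note: the multicast-group subscription described above can be inspected directly. A diagnostic sketch, using the router namespace and qg interface from this report (run as root on the controller hosting the backup router):]

```shell
# List the IPv6 multicast groups the qg interface has joined inside the
# router namespace; extra groups beyond the usual link-local ones are a
# symptom of forwarding being enabled on the backup.
ip netns exec qrouter-43fc42e1-f6ee-40bf-9657-1f463ab5f901 \
    ip -6 maddress show dev qg-d5aa7c20-41
```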

Please note that I'll be away for the next two weeks and won't be able to look into this case until I'm back. Sorry for the inconvenience.
