Bug 1789836 - OVN/DVR inbound Traffic to VIP flows through controllers after VM creation until VM is rebooted
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Lucas Alvares Gomes
QA Contact: Roman Safronov
Depends On: 1839717 1839811
Blocks: 1830734
 
Reported: 2020-01-10 14:26 UTC by Andreas Karis
Modified: 2023-09-07 21:25 UTC
CC: 13 users

Fixed In Version: python-networking-ovn-4.0.4-3.el7ost
Doc Type: No Doc Update
Last Closed: 2020-06-24 11:53:05 UTC




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1842988 0 None None None 2020-01-13 14:43:49 UTC
OpenStack gerrit 703813 0 None MERGED [OVN] Remove VLAN check when setting external_mac 2022-08-10 10:42:31 UTC
Red Hat Issue Tracker OSP-7695 0 None None None 2022-08-10 10:05:42 UTC
Red Hat Knowledge Base (Solution) 5049901 0 None None None 2020-05-05 15:08:11 UTC
Red Hat Product Errata RHBA-2020:2724 0 None None None 2020-06-24 11:53:36 UTC

Description Andreas Karis 2020-01-10 14:26:44 UTC
Description of problem:
OVN/DVR inbound Traffic to VIP flows through controllers after VM creation until VM is rebooted

This is a fresh installation of RHOSP 13 with OVN DVR enabled.

With OVN and DVR enabled, when we create a virtual machine and associate a floating IP with it, traffic from the outside towards the VIP flows through the OpenStack controllers instead of being routed directly on the compute node. When using tcpdump and listening for ARP requests, we can clearly see that controller-0 answers the ARP requests whereas the compute node does not respond.
This behaviour changes after we reboot the virtual machine: traffic is then correctly routed through the compute node.
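
A quick way to tell which path is in effect (a sketch based on the diagnostics later in this report; run on a controller): with DVR, the floating IP's dnat_and_snat NAT entry in the OVN NB database must carry external_mac. If it is empty, DNAT is handled centrally on the gateway chassis (a controller).

~~~
ovn-nbctl find NAT type=dnat_and_snat external_ip=172.31.0.210
# external_mac: []               -> centralized DNAT (this bug)
# external_mac: "fa:16:3e:..."   -> distributed DNAT on the compute node
~~~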



Comment 7 Andreas Karis 2020-01-13 14:03:49 UTC
I can reproduce this problem in a fresh OSP 13 z10 lab. Note that my lab is a hybrid virtual/physical lab: instead of using a physical first-hop gateway, I simply use the undercloud with an additional interface added to it.
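
For context, the undercloud's first-hop "gateway" in this lab is just a VLAN interface carrying the external network's gateway IP. A minimal sketch of such a setup (device and VLAN ID inferred from the eth1.109 / vlan 109 transcripts below; the actual lab commands are not in this report):

~~~
sudo ip link add link eth1 name eth1.109 type vlan id 109
sudo ip addr add 172.31.0.1/24 dev eth1.109
sudo ip link set eth1.109 up
~~~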

---------------------------------------------------

When using neutron OVN-DVR: when deploying a new VM with nova and assigning a floating IP to it right away, OVN answers ARP requests with the wrong MAC address (that of the virtual router). Because of this, traffic flows across the controllers.
Clearing the ARP and FDB entries on the next-hop switch and router does not change this behavior. Only rebooting the nova instance corrects the issue.
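
In this lab the undercloud plays the role of the next-hop, so "clearing" amounts to something like the following sketch (the ip neigh command is shown verbatim further down; the bridge fdb line is hypothetical and only applies if a bridge sits in the path):

~~~
sudo ip neigh del 172.31.0.210 dev eth1.109           # stale ARP entry for the FIP
sudo bridge fdb del fa:16:3e:71:c3:91 dev eth1.109    # stale FDB entry, if bridged
~~~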

(overcloud) [stack@undercloud-0 ~]$ nova boot --nic net-id=$NETID --image rhel --flavor m1.small --key-name id_rsa rhel-test3
+--------------------------------------+---------------------------------------------+
| Property                             | Value                                       |
+--------------------------------------+---------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                      |
| OS-EXT-AZ:availability_zone          |                                             |
| OS-EXT-SRV-ATTR:host                 | -                                           |
| OS-EXT-SRV-ATTR:hostname             | rhel-test3                                  |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                           |
| OS-EXT-SRV-ATTR:instance_name        |                                             |
| OS-EXT-SRV-ATTR:kernel_id            |                                             |
| OS-EXT-SRV-ATTR:launch_index         | 0                                           |
| OS-EXT-SRV-ATTR:ramdisk_id           |                                             |
| OS-EXT-SRV-ATTR:reservation_id       | r-xv1d7hju                                  |
| OS-EXT-SRV-ATTR:root_device_name     | -                                           |
| OS-EXT-SRV-ATTR:user_data            | -                                           |
| OS-EXT-STS:power_state               | 0                                           |
| OS-EXT-STS:task_state                | scheduling                                  |
| OS-EXT-STS:vm_state                  | building                                    |
| OS-SRV-USG:launched_at               | -                                           |
| OS-SRV-USG:terminated_at             | -                                           |
| accessIPv4                           |                                             |
| accessIPv6                           |                                             |
| adminPass                            | j4tzbKXTnX9g                                |
| config_drive                         |                                             |
| created                              | 2020-01-13T13:27:34Z                        |
| description                          | -                                           |
| flavor:disk                          | 16                                          |
| flavor:ephemeral                     | 0                                           |
| flavor:extra_specs                   | {}                                          |
| flavor:original_name                 | m1.small                                    |
| flavor:ram                           | 1024                                        |
| flavor:swap                          | 0                                           |
| flavor:vcpus                         | 1                                           |
| hostId                               |                                             |
| host_status                          |                                             |
| id                                   | 8f3c9355-f488-49d0-9b2c-1169d4f88a90        |
| image                                | rhel (189f041d-4674-4361-ad64-3d4af8631ff5) |
| key_name                             | id_rsa                                      |
| locked                               | False                                       |
| metadata                             | {}                                          |
| name                                 | rhel-test3                                  |
| os-extended-volumes:volumes_attached | []                                          |
| progress                             | 0                                           |
| security_groups                      | default                                     |
| status                               | BUILD                                       |
| tags                                 | []                                          |
| tenant_id                            | 045b5511d89146aaaa5bfb171db73638            |
| updated                              | 2020-01-13T13:27:34Z                        |
| user_id                              | a6beb4d4d89c42b7b4ee8e612955e2f6            |
+--------------------------------------+---------------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ sleep 30 ; openstack server list --long | grep rhel-test3
| 8f3c9355-f488-49d0-9b2c-1169d4f88a90 | rhel-test3 | ACTIVE | None       | Running     | private2=192.168.1.108, 2000:192:168:1:f816:3eff:fe48:432f               | rhel       | 189f041d-4674-4361-ad64-3d4af8631ff5 | m1.small    | 7ef7c085-45f8-456d-ad03-760b1a739bbf | nova              | overcloud-compute-0.localdomain |    
(overcloud) [stack@undercloud-0 ~]$ openstack server add floating ip rhel-test3 172.31.0.210
(overcloud) [stack@undercloud-0 ~]$ openstack server list | grep rhel-test3
| 8f3c9355-f488-49d0-9b2c-1169d4f88a90 | rhel-test3 | ACTIVE | private2=192.168.1.108, 2000:192:168:1:f816:3eff:fe48:432f, 172.31.0.210 | rhel  | m1.small |

The floating IP's MAC address in this example:

(overcloud) [stack@undercloud-0 ~]$ neutron port-list | grep 172.31.0.210
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
| c7988d2c-4d3f-4d1c-ade6-223986c70d1c |      |                                  | fa:16:3e:9e:96:da | {"subnet_id": "33b9ac04-dac5-40cd-a6e7-90e5cd51d15d", "ip_address": "172.31.0.210"}                        |

This MAC address can also be found in the NB NAT table entry of type dnat_and_snat; note that it only appears in external_ids (as neutron:fip_external_mac) while external_mac itself is empty:

[root@overcloud-controller-0 ~]# ovn-nbctl find NAT type=dnat_and_snat
(...)
_uuid               : a45c5c10-847a-4d07-900c-eac0589cd4c7
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:9e:96:da", "neutron:fip_id"="cdeb3dc7-fdc8-493b-9536-edb2163c3d1c", "neutron:fip_port_id"="f2cfccb9-eaf0-4448-8234-af5cc08c1469", "neutron:revision_number"="6", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.210"
external_mac        : []
logical_ip          : "192.168.1.102"
logical_port        : "f2cfccb9-eaf0-4448-8234-af5cc08c1469"
type                : dnat_and_snat

Instead, the ARP reply comes from controller-0 with the controller's virtual router's MAC address, and this can be reproduced even after deleting the ARP entry from the first-hop gateway:

(overcloud) [stack@undercloud-0 ~]$ ip neigh ls | grep 172.31.0.210
(overcloud) [stack@undercloud-0 ~]$ ping 172.31.0.210 -c1 -W1
PING 172.31.0.210 (172.31.0.210) 56(84) bytes of data.
64 bytes from 172.31.0.210: icmp_seq=1 ttl=63 time=2.00 ms

--- 172.31.0.210 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.001/2.001/2.001/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ip neigh ls | grep 172.31.0.210
172.31.0.210 dev eth1.109 lladdr fa:16:3e:71:c3:91 STALE
(overcloud) [stack@undercloud-0 ~]$ sudo ip neigh del 172.31.0.210 dev eth1.109 
(overcloud) [stack@undercloud-0 ~]$ ping 172.31.0.210 -c1 -W1
PING 172.31.0.210 (172.31.0.210) 56(84) bytes of data.
64 bytes from 172.31.0.210: icmp_seq=1 ttl=63 time=2.61 ms

--- 172.31.0.210 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.611/2.611/2.611/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ip neigh ls | grep 172.31.0.210
172.31.0.210 dev eth1.109 lladdr fa:16:3e:71:c3:91 REACHABLE

The answer with the wrong MAC comes from controller-0, and DNAT is not distributed. Instead, all traffic goes through controller-0:

[root@overcloud-controller-0 ~]# tcpdump -nne -i eth3 -l | egrep '172.31.0.212|172.31.0.201|172.31.0.217|172.31.0.210'
(...)
13:30:36.674640 52:54:00:bf:31:d7 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 109, p 0, ethertype ARP, Request who-has 172.31.0.210 tell 172.31.0.1, length 28
13:30:36.675040 fa:16:3e:71:c3:91 > 52:54:00:bf:31:d7, ethertype 802.1Q (0x8100), length 46: vlan 109, p 0, ethertype ARP, Reply 172.31.0.210 is-at fa:16:3e:71:c3:91, length 28
13:30:36.675138 52:54:00:bf:31:d7 > fa:16:3e:71:c3:91, ethertype 802.1Q (0x8100), length 102: vlan 109, p 0, ethertype IPv4, 172.31.0.1 > 172.31.0.210: ICMP echo request, id 5897, seq 1, length 64
13:30:36.676382 fa:16:3e:71:c3:91 > 52:54:00:bf:31:d7, ethertype 802.1Q (0x8100), length 102: vlan 109, p 0, ethertype IPv4, 172.31.0.210 > 172.31.0.1: ICMP echo reply, id 5897, seq 1, length 64
(...) 

We can see the correlation between this MAC address and the router named "router" here:

[root@overcloud-controller-0 ~]# ovn-nbctl show | grep fa:16:3e:71:c3:91 -B2
    port 1a747fa2-abee-4e71-8888-a12c05199ac8
        type: router
        addresses: ["fa:16:3e:71:c3:91"]
--
        networks: ["192.168.0.1/24"]
    port lrp-1a747fa2-abee-4e71-8888-a12c05199ac8
        mac: "fa:16:3e:71:c3:91"

And we can list all ports and NAT configuration like this:

[root@overcloud-controller-0 ~]# ovn-nbctl show neutron-e75c5fb4-29ee-450d-840f-a911c9896256
router 495dc55c-cc18-4671-b12a-4356501ebd54 (neutron-e75c5fb4-29ee-450d-840f-a911c9896256) (aka router)
    port lrp-4c37ace1-fb36-4b52-8bf6-6579370390c1
        mac: "fa:16:3e:b6:20:bc"
        networks: ["192.168.0.1/24"]
    port lrp-1a747fa2-abee-4e71-8888-a12c05199ac8
        mac: "fa:16:3e:71:c3:91"
        networks: ["172.31.0.212/24", "2000:10::f816:3eff:fe71:c391/64"]
        gateway chassis: [87dd8e80-0d73-4f18-819a-48550f793a48 4f28cab8-8722-4fc3-b7c1-346535f293a4 fe02f9da-ce8f-4588-b7db-8d728101ce71]
    port lrp-c3278072-be80-4ce6-b4e0-e8bff1a6d58f
        mac: "fa:16:3e:f6:26:bd"
        networks: ["192.168.10.1/24"]
    port lrp-553ba59a-6076-4c18-824f-fd60a99618a5
        mac: "fa:16:3e:de:e1:48"
        networks: ["192.168.1.1/24"]
    nat 094c26a4-09db-4c8e-8d78-cf9ddd19f390
        external ip: "172.31.0.212"
        logical ip: "192.168.1.0/24"
        type: "snat"
    nat 61b357c0-1c06-4c22-b0e5-50ef64813cc8
        external ip: "172.31.0.201"
        logical ip: "192.168.0.101"
        type: "dnat_and_snat"
    nat 70a50836-3aa8-4460-a242-66ad4ce61a0f
        external ip: "172.31.0.212"
        logical ip: "192.168.0.0/24"
        type: "snat"
    nat a45c5c10-847a-4d07-900c-eac0589cd4c7
        external ip: "172.31.0.210"
        logical ip: "192.168.1.102"
        type: "dnat_and_snat"
    nat b0782824-00e9-4b66-bc4b-9d54cf8ec737
        external ip: "172.31.0.217"
        logical ip: "192.168.0.103"
        type: "dnat_and_snat"
    nat f0943a20-c7d7-494d-97eb-10d43f30accc
        external ip: "172.31.0.212"
        logical ip: "192.168.10.0/24"
        type: "snat"

This behavior only changes when the VM is rebooted:

(overcloud) [stack@undercloud-0 ~]$ nova reboot rhel-test3
Request to reboot server <Server: rhel-test3> has been accepted.
(overcloud) [stack@undercloud-0 ~]$ sleep 120

 [stack@undercloud-0 ~]$ ip neigh ls | grep 172.31.0.210
172.31.0.210 dev eth1.109 lladdr fa:16:3e:9e:96:da REACHABLE

And we can see that the compute node hosting the VM answers with the correct MAC address:

[root@overcloud-compute-0 ~]#  tcpdump -nne -i p2p2 -l | egrep '172.31.0.212|172.31.0.201|172.31.0.217|172.31.0.210'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on p2p2, link-type EN10MB (Ethernet), capture size 262144 bytes
13:35:04.116296 52:54:00:bf:31:d7 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 109, p 0, ethertype ARP, Request who-has 172.31.0.210 tell 172.31.0.1, length 46
13:35:04.116532 fa:16:3e:9e:96:da > 52:54:00:bf:31:d7, ethertype 802.1Q (0x8100), length 64: vlan 109, p 0, ethertype ARP, Reply 172.31.0.210 is-at fa:16:3e:9e:96:da, length 46
13:35:04.116620 52:54:00:bf:31:d7 > fa:16:3e:9e:96:da, ethertype 802.1Q (0x8100), length 102: vlan 109, p 0, ethertype IPv4, 172.31.0.1 > 172.31.0.210: ICMP echo request, id 6571, seq 1, length 64
(...)

(overcloud) [stack@undercloud-0 ~]$ ip neigh ls | grep 172.31.0.210
172.31.0.210 dev eth1.109 lladdr fa:16:3e:9e:96:da STALE

Comment 8 Andreas Karis 2020-01-13 14:10:06 UTC
Note that we can see the same behavior in the customer's environment:

Before instance reboot: https://bugzilla.redhat.com/attachment.cgi?id=1651310

Controller-0 answers the ARP request for 185.155.92.2 with fa:16:3e:60:9c:ec; traffic then flows through controller-0 between 00:1c:73:00:09:99 and fa:16:3e:60:9c:ec.

After instance reboot: https://bugzilla.redhat.com/attachment.cgi?id=1651315

A node answers the ARP request for 185.155.92.2 with fa:16:3e:ac:0a:53 (we don't see the ARP reply in the output, but that doesn't matter, since the ICMP packets show it is happening); traffic then flows through Compute-di504233 between 00:1c:73:00:09:99 and fa:16:3e:ac:0a:53.

Comment 10 Andreas Karis 2020-01-13 14:43:25 UTC
Note that this appears to be the same issue as reported upstream: https://bugs.launchpad.net/tripleo/+bug/1842988

Comment 11 Andreas Karis 2020-01-13 14:49:31 UTC
And from my lab, here are the steps to reproduce and verify this:

Create server without FIP:
~~~
(overcloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+------------+--------+------------+-------------+--------------------------------------------------------------------------+
| ID                                   | Name       | Status | Task State | Power State | Networks                                                                 |
+--------------------------------------+------------+--------+------------+-------------+--------------------------------------------------------------------------+
| ac73af43-c22e-4e5b-be90-f2bc79b30ca4 | rhel-test1 | ACTIVE | -          | Running     | private1=2000:192:168:0:f816:3eff:fe3f:9ee4, 192.168.0.101, 172.31.0.201 |
| f8dccb8d-b39a-4231-b2f1-46dae48744b4 | rhel-test2 | ACTIVE | -          | Running     | private1=2000:192:168:0:f816:3eff:fe00:bcd6, 192.168.0.103, 172.31.0.217 |
| 1cd9d144-9951-4bff-8b97-3a6e02b79467 | rhel-test3 | ACTIVE | -          | Running     | private2=192.168.1.101, 2000:192:168:1:f816:3eff:feb7:5d7d               |
+--------------------------------------+------------+--------+------------+-------------+--------------------------------------------------------------------------+
~~~

Check NAT rules:
~~~
[root@overcloud-controller-0 ~]# export SB=$(sudo ovs-vsctl get open . external_ids:ovn-remote | sed -e 's/\"//g')
[root@overcloud-controller-0 ~]# export NB=$(sudo ovs-vsctl get open . external_ids:ovn-remote | sed -e 's/\"//g' | sed -e 's/6642/6641/g')
[root@overcloud-controller-0 ~]# alias ovn-sbctl='sudo docker exec ovn_controller ovn-sbctl --db=$SB'
[root@overcloud-controller-0 ~]# alias ovn-nbctl='sudo docker exec ovn_controller ovn-nbctl --db=$NB'
[root@overcloud-controller-0 ~]# alias ovn-trace='sudo docker exec ovn_controller ovn-trace --db=$SB'
[root@overcloud-controller-0 ~]# ovn-nbctl find NAT type=dnat_and_snat
_uuid               : 61b357c0-1c06-4c22-b0e5-50ef64813cc8
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:ee:4b:3b", "neutron:fip_id"="b0c03864-8c79-43b3-9240-370cfbcde904", "neutron:fip_port_id"="b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba", "neutron:revision_number"="2", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.201"
external_mac        : "fa:16:3e:ee:4b:3b"
logical_ip          : "192.168.0.101"
logical_port        : "b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba"
type                : dnat_and_snat

_uuid               : b0782824-00e9-4b66-bc4b-9d54cf8ec737
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:8b:04:4f", "neutron:fip_id"="4fc2b87a-a188-4468-ad40-8b03ed26b56d", "neutron:fip_port_id"="581de757-7b2e-4995-aa84-3c4bc58edd4d", "neutron:revision_number"="2", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.217"
external_mac        : "fa:16:3e:8b:04:4f"
logical_ip          : "192.168.0.103"
logical_port        : "581de757-7b2e-4995-aa84-3c4bc58edd4d"
type                : dnat_and_snat
[root@overcloud-controller-0 ~]# 
~~~
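
(For reference: external_ids:ovn-remote points at the OVN southbound database, which listens on port 6642 by default; the second sed derives the northbound address by swapping in 6641, the default NB port. The aliases then run the ovn-*ctl clients inside the ovn_controller container against those databases.)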

Add FIP:
~~~
(overcloud) [stack@undercloud-0 ~]$ openstack server add floating ip rhel-test3 172.31.0.210
(overcloud) [stack@undercloud-0 ~]$ 
~~~

Check NAT rules:
~~~
[root@overcloud-controller-0 ~]# ovn-nbctl find NAT type=dnat_and_snat
_uuid               : 61b357c0-1c06-4c22-b0e5-50ef64813cc8
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:ee:4b:3b", "neutron:fip_id"="b0c03864-8c79-43b3-9240-370cfbcde904", "neutron:fip_port_id"="b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba", "neutron:revision_number"="2", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.201"
external_mac        : "fa:16:3e:ee:4b:3b"
logical_ip          : "192.168.0.101"
logical_port        : "b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba"
type                : dnat_and_snat

_uuid               : b0782824-00e9-4b66-bc4b-9d54cf8ec737
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:8b:04:4f", "neutron:fip_id"="4fc2b87a-a188-4468-ad40-8b03ed26b56d", "neutron:fip_port_id"="581de757-7b2e-4995-aa84-3c4bc58edd4d", "neutron:revision_number"="2", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.217"
external_mac        : "fa:16:3e:8b:04:4f"
logical_ip          : "192.168.0.103"
logical_port        : "581de757-7b2e-4995-aa84-3c4bc58edd4d"
type                : dnat_and_snat

_uuid               : f406e2aa-697d-4a44-a078-ac50f5699202
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:9e:96:da", "neutron:fip_id"="cdeb3dc7-fdc8-493b-9536-edb2163c3d1c", "neutron:fip_port_id"="79f8f865-30c7-48ab-9017-76ce6c007c1b", "neutron:revision_number"="18", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.210"
external_mac        : []
logical_ip          : "192.168.1.101"
logical_port        : "79f8f865-30c7-48ab-9017-76ce6c007c1b"
type                : dnat_and_snat
[root@overcloud-controller-0 ~]# 
~~~

Reboot server:
~~~
(overcloud) [stack@undercloud-0 ~]$ nova reboot rhel-test3
Request to reboot server <Server: rhel-test3> has been accepted.
(overcloud) [stack@undercloud-0 ~]$ 
~~~

Check NAT rules:
~~~
[root@overcloud-controller-0 ~]# ovn-nbctl find NAT type=dnat_and_snat
_uuid               : 61b357c0-1c06-4c22-b0e5-50ef64813cc8
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:ee:4b:3b", "neutron:fip_id"="b0c03864-8c79-43b3-9240-370cfbcde904", "neutron:fip_port_id"="b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba", "neutron:revision_number"="2", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.201"
external_mac        : "fa:16:3e:ee:4b:3b"
logical_ip          : "192.168.0.101"
logical_port        : "b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba"
type                : dnat_and_snat

_uuid               : b0782824-00e9-4b66-bc4b-9d54cf8ec737
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:8b:04:4f", "neutron:fip_id"="4fc2b87a-a188-4468-ad40-8b03ed26b56d", "neutron:fip_port_id"="581de757-7b2e-4995-aa84-3c4bc58edd4d", "neutron:revision_number"="2", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.217"
external_mac        : "fa:16:3e:8b:04:4f"
logical_ip          : "192.168.0.103"
logical_port        : "581de757-7b2e-4995-aa84-3c4bc58edd4d"
type                : dnat_and_snat

_uuid               : f406e2aa-697d-4a44-a078-ac50f5699202
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:9e:96:da", "neutron:fip_id"="cdeb3dc7-fdc8-493b-9536-edb2163c3d1c", "neutron:fip_port_id"="79f8f865-30c7-48ab-9017-76ce6c007c1b", "neutron:revision_number"="18", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.210"
external_mac        : "fa:16:3e:9e:96:da"
logical_ip          : "192.168.1.101"
logical_port        : "79f8f865-30c7-48ab-9017-76ce6c007c1b"
type                : dnat_and_snat
~~~

Comment 12 Andreas Karis 2020-01-13 14:54:59 UTC
The issue can also be recreated by detaching and reattaching the FIP:
~~~
(overcloud) [stack@undercloud-0 ~]$ openstack server remove floating ip rhel-test3 172.31.0.210
(overcloud) [stack@undercloud-0 ~]$ openstack server add floating ip rhel-test3 172.31.0.210
(overcloud) [stack@undercloud-0 ~]$ 
~~~

~~~
[root@overcloud-controller-0 ~]#  ovn-nbctl find NAT type=dnat_and_snat
_uuid               : 61b357c0-1c06-4c22-b0e5-50ef64813cc8
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:ee:4b:3b", "neutron:fip_id"="b0c03864-8c79-43b3-9240-370cfbcde904", "neutron:fip_port_id"="b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba", "neutron:revision_number"="2", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.201"
external_mac        : "fa:16:3e:ee:4b:3b"
logical_ip          : "192.168.0.101"
logical_port        : "b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba"
type                : dnat_and_snat

_uuid               : b0782824-00e9-4b66-bc4b-9d54cf8ec737
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:8b:04:4f", "neutron:fip_id"="4fc2b87a-a188-4468-ad40-8b03ed26b56d", "neutron:fip_port_id"="581de757-7b2e-4995-aa84-3c4bc58edd4d", "neutron:revision_number"="2", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.217"
external_mac        : "fa:16:3e:8b:04:4f"
logical_ip          : "192.168.0.103"
logical_port        : "581de757-7b2e-4995-aa84-3c4bc58edd4d"
type                : dnat_and_snat

_uuid               : 2b58f203-5560-480d-869f-66ddb0d97ed8
external_ids        : {"neutron:fip_external_mac"="fa:16:3e:9e:96:da", "neutron:fip_id"="cdeb3dc7-fdc8-493b-9536-edb2163c3d1c", "neutron:fip_port_id"="79f8f865-30c7-48ab-9017-76ce6c007c1b", "neutron:revision_number"="22", "neutron:router_name"="neutron-e75c5fb4-29ee-450d-840f-a911c9896256"}
external_ip         : "172.31.0.210"
external_mac        : []
logical_ip          : "192.168.1.101"
logical_port        : "79f8f865-30c7-48ab-9017-76ce6c007c1b"
type                : dnat_and_snat
[root@overcloud-controller-0 ~]# ovn-nbctl lr-nat-list  neutron-e75c5fb4-29ee-450d-840f-a911c9896256
TYPE             EXTERNAL_IP        LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
dnat_and_snat    172.31.0.201       192.168.0.101         fa:16:3e:ee:4b:3b    b7e9fb53-db4e-4b3e-a6e4-2acbbc4ba9ba
dnat_and_snat    172.31.0.210       192.168.1.101
dnat_and_snat    172.31.0.217       192.168.0.103         fa:16:3e:8b:04:4f    581de757-7b2e-4995-aa84-3c4bc58edd4d
snat             172.31.0.212       192.168.10.0/24
snat             172.31.0.212       192.168.0.0/24
snat             172.31.0.212       192.168.1.0/24
~~~
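
The re-created NAT row again has logical_port set but external_mac empty, which is why DNAT stays centralized. A possible manual workaround, sketched here but not validated in this report (UUID and MAC taken from the output above; neutron:fip_external_mac already records the right value), is to set external_mac on the row directly and let ovn-northd re-plumb the flows:

~~~
ovn-nbctl set NAT 2b58f203-5560-480d-869f-66ddb0d97ed8 external_mac='"fa:16:3e:9e:96:da"'
~~~

Rebooting the instance, or applying the fixed package, achieves the same end state through networking-ovn itself.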

Comment 14 Andreas Karis 2020-01-19 08:55:14 UTC
Hi Assaf,

When is your next triage meeting where this case can be assigned to an engineer?

Thanks,

Andreas

Comment 27 Andreas Karis 2020-01-31 10:08:59 UTC
Hi Lucas,

IT had to move the lab so it's currently powered down. 

I'd like to know if a backport to OSP 13, OSP 15 and OSP 16 will be feasible. If not, please let me know and I'll communicate that to the customer.

Kind regards,

Andreas

Comment 28 Lucas Alvares Gomes 2020-01-31 10:19:27 UTC
(In reply to Andreas Karis from comment #27)
> Hi Lucas,
> 
> IT had to move the lab so it's currently powered down. 
> 
> I'd like to know if a backport to OSP 13, OSP 15 and OSP 16 will be
> feasible. If not, please let me know and I'll communicate that to the
> customer.
> 
> Kind regards,
> 
> Andreas

Hi Andreas,

The patch [0] currently has a +2 vote upstream. I will see if I can get someone else to review it today and perhaps approve/merge it.

Once it's merged, I'll backport it all the way down to OSP 13.

[0] https://review.opendev.org/#/c/703813/

Comment 29 Andreas Karis 2020-02-05 08:50:21 UTC
Thanks!

Comment 30 Andreas Karis 2020-02-20 13:49:15 UTC
Hi,

How is the work on upstream going? :-)

- Andreas

Comment 31 Lucas Alvares Gomes 2020-02-20 15:04:48 UTC
(In reply to Andreas Karis from comment #30)
> Hi,
> 
> How is the work on upstream going? :-)
> 
> - Andreas

Hi Andreas,

It was merged on the upstream master branch but not yet on the upstream stable/queens (OSP 13) branch [0].

Regardless of that, I've just proposed the fix directly to the OSP 13 branch at [1], will see if I can get some reviews there.

[0] https://review.opendev.org/#/c/705253
[1] https://code.engineering.redhat.com/gerrit/#/c/192476/

Comment 42 Roman Safronov 2020-05-25 16:18:02 UTC
Cannot be verified until https://bugzilla.redhat.com/show_bug.cgi?id=1839811 is fixed.

Comment 43 Roman Safronov 2020-06-03 08:21:10 UTC
Verified on 13.0-RHEL-7/2020-05-28.2 with python-networking-ovn-4.0.4-6.el7ost.noarch.
Created an environment with an external network of type VLAN (the network where FIPs are allocated), created a router connected to that network, and launched an instance on an internal network connected to the router.
Verified that when setting a FIP on the port, external_mac is set and the FIP is accessible.
Also verified after deleting/recreating the FIP and after stopping/starting the instance. In all cases external_mac was present when needed and the FIP was accessible from the external network. Traffic passed through the compute node.
Also executed the automated OVN DVR tests, which verify DVR traffic flow (ingress/egress/FIP-to-FIP) and FIP delete/recreate; they all passed.
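
For anyone re-running this verification, a condensed sketch (instance, router, and FIP names reused from the reproducer in comment 11; check the package level first, per the Fixed In Version field):

~~~
rpm -q python-networking-ovn    # expect >= python-networking-ovn-4.0.4-3.el7ost
openstack server add floating ip rhel-test3 172.31.0.210
ovn-nbctl lr-nat-list neutron-e75c5fb4-29ee-450d-840f-a911c9896256 | grep 172.31.0.210
# the dnat_and_snat row should now show EXTERNAL_MAC and LOGICAL_PORT
# immediately, with no instance reboot required
~~~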

Comment 46 errata-xmlrpc 2020-06-24 11:53:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2724

