Bug 2052243

Summary: binding_host_id and status not set on virtual IPs
Product: Red Hat OpenStack
Component: openstack-neutron
Version: 17.1 (Wallaby)
Reporter: François Rigault <frigo>
Assignee: Rodolfo Alonso <ralonsoh>
QA Contact: Eran Kuris <ekuris>
Status: CLOSED ERRATA
Severity: low
Priority: low
Keywords: Triaged
Target Milestone: z2
Target Release: 17.1
Hardware: noarch
OS: Linux
Fixed In Version: openstack-neutron-18.6.1-1.20230518200974.el9ost
CC: apevec, astupnik, chrisw, egarciar, gbrinn, lhh, ltamagno, majopela, ralonsoh, scohen
Cloned to: 2102261
Last Closed: 2024-01-16 14:31:53 UTC
Type: Bug
Bug Blocks: 2102261    
Attachments:
    set binding host on vips

Description François Rigault 2022-02-08 22:51:31 UTC
Created attachment 1859934 [details]
set binding host on vips

Description of problem:
Running `openstack port show` on a VIP that is currently owned by a VM, the port shows as DOWN and unbound to any host, while it should show ACTIVE and be bound to a host.


Version-Release number of selected component (if applicable):
python3-networking-ovn-7.4.2-2.20210601204831.el8ost.13.noarch.rpm

How reproducible:
Always for the missing binding host; the DOWN vs. ACTIVE status seems intermittent.

Steps to Reproduce:
1. Define a port for the server and allow it to use a VIP:
openstack port create --network private --security-group prodlike serverport
openstack server create --security-group prodlike --port serverport \
   --key-name stack --flavor m1.small --image cirros myserver \
   --availability-zone nova:cpu35d
openstack floating ip set --port serverport 10.64.154.3
openstack port create myvip  --network private \
   --fixed-ip ip-address=192.168.200.20
openstack port set --allowed-address ip-address=192.168.200.20 serverport

At this point, ovn-sbctl should not show this port on any chassis:
$ ovn-sbctl show
Chassis "1126ea9a-2860-4e5c-9ab5-ca1e8959edee"
    hostname: cpu35d
    Encap geneve
        ip: "10.64.145.100"
        options: {csum="true"}
    Port_Binding cr-lrp-52d45a86-6cbf-43ff-9700-755490192441
    Port_Binding "d4aafa35-ab96-4451-9623-983a164f28dd"

The port is not bound yet:
$ openstack port show myvip  -c status -c binding_host_id -c id
+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| binding_host_id |                                      |
| id              | 55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3 |
| status          | DOWN                                 |
+-----------------+--------------------------------------+


2. Make the VM take ownership of the VIP:
ssh 10.64.154.3 -l cirros
$ sudo ip addr add 192.168.200.20 dev eth0
$ sudo arping -U 192.168.200.20
ARPING 192.168.200.20 from 192.168.200.20 eth0
^CSent 2 probe(s) (2 broadcast(s))
Received 0 response(s) (0 request(s), 0 broadcast(s))

At this point ovn-sbctl shows the VIP port bound to the chassis:
Chassis "1126ea9a-2860-4e5c-9ab5-ca1e8959edee"
    hostname: cpu35d
    Encap geneve
        ip: "10.64.145.100"
        options: {csum="true"}
    Port_Binding "55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3"
    Port_Binding cr-lrp-52d45a86-6cbf-43ff-9700-755490192441
    Port_Binding "d4aafa35-ab96-4451-9623-983a164f28dd"


3. Check the state of the port in Neutron
openstack port show myvip  -c status -c binding_host_id -c id

Actual results:
+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| binding_host_id |                                      |
| id              | 55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3 |
| status          | DOWN                                 |
+-----------------+--------------------------------------+


Expected results:
+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| binding_host_id | cpu35d                               |
| id              | 55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3 |
| status          | ACTIVE                               |
+-----------------+--------------------------------------+
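
For reference, a minimal openstacksdk sketch of the same check done programmatically; the cloud name "overcloud" is an assumption, and "myvip" is the port created in the steps above (the script is illustrative, not part of the original report):

```python
# Minimal sketch (not part of the original report): read the VIP port's
# status and binding host via openstacksdk. The cloud name "overcloud" is
# an assumed clouds.yaml entry; "myvip" is the port created above.
import openstack

conn = openstack.connect(cloud="overcloud")
port = conn.network.find_port("myvip", ignore_missing=False)
print(port.id, port.status, port.binding_host_id)
```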


Additional info:
I have two use cases:
- As an operator, I need this information, and for VIPs I can currently only get it from the OVN southbound DB.
- neutron-dynamic-routing only considers ports with a binding_host_id.

Attaching an example patch for illustration purposes.

Comment 1 François Rigault 2022-02-11 11:07:34 UTC
The patch I am testing is pretty naive and has a few issues:
- there is a loop between Neutron and OVN, as both keep sending each other updates with an increased revision number (see the sketch below)!
- other parts of the code (e.g. _set_unset_virtual_port_type) expect the VIP to be unbound.
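
To illustrate the first issue, here is a self-contained sketch of the kind of revision-number guard that breaks such a loop: an update is only applied when it carries a strictly newer revision than the one already recorded. Names are hypothetical; this is not the neutron implementation:

```python
# Illustrative only: apply an update only if it carries a strictly newer
# revision number, so Neutron and OVN stop bouncing the same change back
# and forth. Hypothetical names, not taken from neutron.
stored_revisions = {}  # port_id -> last applied revision number

def apply_update(port_id, revision, apply_fn):
    last = stored_revisions.get(port_id, -1)
    if revision <= last:
        # Stale or already-applied update: drop it instead of re-emitting it.
        return False
    apply_fn(port_id)
    stored_revisions[port_id] = revision
    return True
```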

Comment 3 Rodolfo Alonso 2022-02-17 08:56:57 UTC
Hello Francois:

The OVN ML2 plugin mimics the behaviour of the OVS ML2 plugin. In OVS the VIP port is always DOWN, as you can see in [1] (the output in an OVS environment). In fact, we explicitly skip the port binding event for virtual ports [2].

I'll check with the Neutron community whether we can implement this feature for the OVN plugin only, which would introduce a feature gap between the two drivers.

Regards.

[1]http://pastebin.test.redhat.com/1030497
[2]https://github.com/openstack/neutron/blob/3dfe6072421e3d5dc708a3bf065fb1a64ea3129a/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L487-L492
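
For context, a simplified sketch of the kind of guard [2] refers to: the OVN port-binding event handler returns early for virtual ports, so VIP ports never get a status or binding-host update. This is an illustration, not the actual neutron ovsdb_monitor code:

```python
# Simplified illustration of the skip described in [2]; not the actual
# neutron code. Virtual (VIP) Port_Binding rows are deliberately ignored.
LSP_TYPE_VIRTUAL = "virtual"

def handle_port_binding_event(row, set_port_status_up):
    if getattr(row, "type", "") == LSP_TYPE_VIRTUAL:
        # VIP ports are skipped, so Neutron never marks them ACTIVE or
        # records a binding host for them.
        return
    set_port_status_up(row.logical_port)
```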

Comment 4 François Rigault 2022-02-17 09:24:36 UTC
I should really not have mentioned the "status"; it is a nice-to-have. The real need is the host binding, which is very useful for operators and makes dynamic routing work for VIPs.

Conceptually, in [2] L3GW ports and VIPs are handled identically.
My wish is that this similarity extend to the way the host binding is updated on Port_Binding update events, since the code already updates the port binding for L3GW ports.
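
As a rough sketch of the behaviour described above (hypothetical helper names, assuming the chassis reference is available on the Port_Binding update event, as it is for L3GW ports):

```python
# Illustrative sketch of the requested behaviour, not the actual neutron
# code: when OVN reports a chassis for a virtual (VIP) Port_Binding, mirror
# its hostname into the Neutron port's binding host.
def on_port_binding_update(row, old, update_port_host):
    new_chassis = getattr(row, "chassis", None)
    old_chassis = getattr(old, "chassis", None)
    if new_chassis == old_chassis:
        return
    # An empty hostname clears the binding when the VIP moves away.
    hostname = new_chassis.hostname if new_chassis else ""
    update_port_host(row.logical_port, hostname)
```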

Cheers

Comment 5 Rodolfo Alonso 2022-02-17 10:43:14 UTC
Hello Francois:

Right, this is also being considered in [1] for OVS. I'll push an upstream patch to populate the host binding ID on the VIP port.

Regards.

[1]https://review.opendev.org/c/openstack/neutron/+/601336

Comment 7 François Rigault 2022-02-25 18:02:56 UTC
Thanks! I'm going to check the patch next week.

One annoying thing I remember is that some parts, like
https://github.com/openstack/neutron/blob/5c47957e89062c1e99b6e7dc28e96eff52ce514e/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L560

actually rely on the VIP port being "unbound".

Comment 8 Rodolfo Alonso 2022-02-25 18:34:37 UTC
I didn't change that. We always consider the VIP port to be unbound, and changing this logic implies deeper changes in the code [1]. This patch only adds/removes the port binding host, to provide additional information about where the VIP is being used, according to OVN.

[1]https://review.opendev.org/c/openstack/neutron/+/830624/3/neutron/plugins/ml2/plugin.py

Comment 9 François Rigault 2022-06-27 05:05:50 UTC
Hi! Thanks a ton for developing this patch.

I am testing it (or rather, a version of it against Train, since I run RHOSP 16.2). It worked for 3 minutes; then a "Maintenance task: Fixing resource" task runs and transforms the port from

```
parent_port         : []
requested_chassis   : []
tag                 : []
tunnel_key          : 5
type                : virtual
up                  : true
```

into

```
requested_chassis   : 3a8f006b-875a-441f-9102-854e8ef8b389
tag                 : []
tunnel_key          : 5
type                : ""
up                  : true
virtual_parent      : []
```

and at that point the VIP is gone. I am checking your other patch here
https://review.opendev.org/c/openstack/neutron/+/842297/2/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py
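
For reference, one way to watch for this regression is to poll the SB Port_Binding type and flag when it flips away from "virtual". This sketch is illustrative and assumes ovn-sbctl is reachable from the host running it; the UUID is the VIP port from the steps above:

```python
# Illustrative watcher (not part of the original report): poll the SB
# Port_Binding type of the VIP port and stop when the maintenance task has
# rewritten it from "virtual" to "".
import subprocess
import time

VIP_PORT = "55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3"

while True:
    ptype = subprocess.run(
        ["ovn-sbctl", "--bare", "--columns=type", "find", "Port_Binding",
         f"logical_port={VIP_PORT}"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    print(f"type={ptype!r}")
    if ptype != "virtual":
        print("Port_Binding type is no longer 'virtual'; VIP handling is broken")
        break
    time.sleep(10)
```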

Comment 26 errata-xmlrpc 2024-01-16 14:31:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:0209