Created attachment 1859934 [details]
set binding host on vips

Description of problem:

Running `openstack port show` on a VIP that is currently owned by a VM: the port shows as DOWN and unbound to any host, while it should show ACTIVE and be bound to a host.

Version-Release number of selected component (if applicable):

python3-networking-ovn-7.4.2-2.20210601204831.el8ost.13.noarch.rpm

How reproducible:

All the time for the binding host; seems intermittent for the DOWN vs ACTIVE port status.

Steps to Reproduce:

1. Define a port for our server and allow it to use a VIP:

openstack port create --network private --security-group prodlike serverport
openstack server create --security-group prodlike --port serverport \
    --key-name stack --flavor m1.small --image cirros myserver \
    --availability-zone nova:cpu35d
openstack floating ip set --port serverport 10.64.154.3
openstack port create myvip --network private \
    --fixed-ip ip-address=192.168.200.20
openstack port set --allowed-address ip-address=192.168.200.20 serverport

At this moment, ovn-sbctl should not show this port on any chassis:

$ ovn-sbctl show
Chassis "1126ea9a-2860-4e5c-9ab5-ca1e8959edee"
    hostname: cpu35d
    Encap geneve
        ip: "10.64.145.100"
        options: {csum="true"}
    Port_Binding cr-lrp-52d45a86-6cbf-43ff-9700-755490192441
    Port_Binding "d4aafa35-ab96-4451-9623-983a164f28dd"

The port is not bound yet:

$ openstack port show myvip -c status -c binding_host_id -c id
+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| binding_host_id |                                      |
| id              | 55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3 |
| status          | DOWN                                 |
+-----------------+--------------------------------------+

2.
Make the VM own the port:

ssh 10.64.154.3 -l cirros
$ sudo ip addr add 192.168.200.20 dev eth0
$ sudo arping -U 192.168.200.20
ARPING 192.168.200.20 from 192.168.200.20 eth0
^CSent 2 probe(s) (2 broadcast(s))
Received 0 response(s) (0 request(s), 0 broadcast(s))

At this moment ovn-sbctl shows our new port bound to the chassis:

Chassis "1126ea9a-2860-4e5c-9ab5-ca1e8959edee"
    hostname: cpu35d
    Encap geneve
        ip: "10.64.145.100"
        options: {csum="true"}
    Port_Binding "55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3"
    Port_Binding cr-lrp-52d45a86-6cbf-43ff-9700-755490192441
    Port_Binding "d4aafa35-ab96-4451-9623-983a164f28dd"

3. Check the state of the port in Neutron:

openstack port show myvip -c status -c binding_host_id -c id

Actual results:

+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| binding_host_id |                                      |
| id              | 55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3 |
| status          | DOWN                                 |
+-----------------+--------------------------------------+

Expected results:

+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| binding_host_id | cpu35d                               |
| id              | 55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3 |
| status          | ACTIVE                               |
+-----------------+--------------------------------------+

Additional info:

I have two use cases:
- As an operator, I need this information, and at the moment I can only get it from the southbound DB for VIPs.
- neutron-dynamic-routing only considers ports with a binding_host_id.

Attaching an example patch for illustration purposes.
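For context, OVN treats myvip as a "virtual" port because its fixed IP appears in another port's allowed-address pairs; virtual ports stay unbound until a parent actually claims the address (here, via the GARP sent by `arping -U`). A toy sketch of that classification, with hypothetical helper names (this is not the actual Neutron code):

```python
from typing import Dict, List, Set

def virtual_parents(vip_ip: str, ports: Dict[str, List[str]]) -> Set[str]:
    """Return the ports whose allowed-address pairs contain the VIP IP.

    `ports` maps a port name to its list of allowed-address-pair IPs.
    A VIP with at least one parent is treated as OVN type "virtual":
    it stays DOWN/unbound until a parent's chassis claims the address.
    """
    return {name for name, pairs in ports.items() if vip_ip in pairs}

# Mirroring the steps above: serverport carries 192.168.200.20 in its
# allowed-address pairs, so myvip becomes a virtual port with
# serverport as its virtual parent.
parents = virtual_parents("192.168.200.20", {"serverport": ["192.168.200.20"]})
print(parents)  # {'serverport'}
```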
The patch I am testing is pretty naive and has a few issues:
- There is a loop between Neutron and OVN, as both send each other updates with an increased revision number!
- Other parts of the code (e.g. _set_unset_virtual_port_type) expect the VIP to be unbound.
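The revision-number loop described above can be illustrated with a toy simulation (this is not the actual ovn_revision_numbers code, just a model of the feedback): if each side bumps the revision whenever it receives the other side's update, every write triggers another write; copying the source revision instead of bumping past it lets the loop settle.

```python
def naive_rounds(max_rounds: int = 10) -> int:
    """Each side bumps the revision when it sees the other's update,
    so the two sides chase each other forever (capped here)."""
    neutron_rev = ovn_rev = 0
    rounds = 0
    while rounds < max_rounds:
        ovn_rev = neutron_rev + 1      # OVN records the update, bumps rev
        neutron_rev = ovn_rev + 1      # Neutron sees a newer rev, writes back
        rounds += 1
    return rounds

def guarded_rounds() -> int:
    """Copying the revision (writing only while strictly behind)
    converges after a single round-trip."""
    neutron_rev, ovn_rev = 1, 0
    rounds = 0
    while ovn_rev < neutron_rev:       # copy, don't bump past the source
        ovn_rev = neutron_rev
        rounds += 1
    return rounds

print(naive_rounds())    # 10 -- hits the safety cap, would loop forever
print(guarded_rounds())  # 1
```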
Hello Francois:

The OVN ML2 plugin mimics the behaviour of the OVS ML2 plugin. In OVS the VIP port is always DOWN, as you can see in [1] (this is the output in an OVS environment). Actually, we explicitly skip the port binding event for virtual ports [2].

I'll check with the Neutron community whether we can implement this feature for the OVN plugin only, adding a feature gap between the two drivers.

Regards.

[1] http://pastebin.test.redhat.com/1030497
[2] https://github.com/openstack/neutron/blob/3dfe6072421e3d5dc708a3bf065fb1a64ea3129a/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L487-L492
I should really not have mentioned the "status"; it is a nice-to-have. The real need is the host binding, which is very useful for operators and makes dynamic routing work for VIPs.

Conceptually, in [2], L3GW ports and VIPs are handled identically. My wish is that this similarity extends to the way the host binding is updated upon Port_Binding update events, since the code already updates the port binding for L3GW ports.

Cheers
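The suggestion above - reacting to Port_Binding updates for virtual ports the way the monitor already does for L3GW ports - could take roughly this shape. All class and field names here are hypothetical, purely to illustrate the idea; this is not the actual ovsdb_monitor code:

```python
class VirtualPortBindingEvent:
    """Sketch of a SB Port_Binding update handler that copies the
    chassis hostname into Neutron's binding_host_id for VIP ports,
    analogous to what is done for L3GW (cr-lrp) ports."""

    def __init__(self, neutron_ports):
        # neutron_ports: port uuid -> {"binding_host_id": str, ...}
        self.neutron_ports = neutron_ports

    def run(self, logical_port: str, port_type: str, chassis_hostname: str):
        if port_type != "virtual":
            return  # other port types are handled by existing events
        port = self.neutron_ports.get(logical_port)
        if port is None:
            return
        # Chassis set -> record the host; chassis cleared -> unbind.
        port["binding_host_id"] = chassis_hostname or ""

# Simulating the SB event from the reproduction steps: the VIP's
# Port_Binding appears on chassis cpu35d.
ports = {"55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3": {"binding_host_id": ""}}
ev = VirtualPortBindingEvent(ports)
ev.run("55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3", "virtual", "cpu35d")
print(ports["55c03aa0-2a21-4f28-bf5d-6ec2dcc5f7e3"]["binding_host_id"])  # cpu35d
```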
Hello Francois:

Right, this is something that is also being considered for OVS in [1]. I'll push an upstream patch to populate the host binding ID in the VIP port.

Regards.

[1] https://review.opendev.org/c/openstack/neutron/+/601336
Thanks! I'm going to check the patch next week. One annoying thing I remember is that some parts of the code, like https://github.com/openstack/neutron/blob/5c47957e89062c1e99b6e7dc28e96eff52ce514e/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L560, actually rely on the VIP port being "unbound".
I didn't change that. We still consider the VIP port unbound, and changing this logic would imply deeper changes in the code [1]. This patch only adds/removes the port binding host, to provide additional information about where the VIP is being used, according to OVN.

[1] https://review.opendev.org/c/openstack/neutron/+/830624/3/neutron/plugins/ml2/plugin.py
Hi! Thanks a ton for developing this patch. I am testing it (or rather, a version of it backported to Train, since I run RHOSP 16.2). It worked for 3 minutes; then a "Maintenance task: Fixing resource" task runs and transforms the port from

```
parent_port       : []
requested_chassis : []
tag               : []
tunnel_key        : 5
type              : virtual
up                : true
```

into

```
requested_chassis : 3a8f006b-875a-441f-9102-854e8ef8b389
tag               : []
tunnel_key        : 5
type              : ""
up                : true
virtual_parent    : []
```

and at that point the VIP is gone.

I am checking your other patch here: https://review.opendev.org/c/openstack/neutron/+/842297/2/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py
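One plausible reading of the behaviour above (an assumption on my part, not confirmed from the code): the maintenance task regenerates the OVN row from Neutron's view of the port, and once the naive patch stores a binding host in Neutron, the port no longer looks like an unbound VIP, so the recomputed type flips from "virtual" to "". A toy model of that decision, with hypothetical field names:

```python
def compute_ovn_type(neutron_port: dict) -> str:
    """Toy version of the type the sync/maintenance code would
    recompute: a port is "virtual" only while Neutron sees it as
    unbound. Storing a binding host makes it look like a regular VIF,
    so the maintenance task rewrites type "virtual" to ""."""
    if neutron_port.get("binding_host_id"):
        return ""          # bound ports are treated as regular VIFs
    if neutron_port.get("is_vip"):
        return "virtual"
    return ""

before = compute_ovn_type({"is_vip": True, "binding_host_id": ""})
after = compute_ovn_type({"is_vip": True, "binding_host_id": "cpu35d"})
print(before, repr(after))  # virtual ''
```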
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2024:0209