Bug 2169349
| Summary: | [ovn provider] Avoid use of ovn-metadata-port for HM health check packets | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Fernando Royo <froyo> |
| Component: | python-ovn-octavia-provider | Assignee: | Fernando Royo <froyo> |
| Status: | CLOSED ERRATA | QA Contact: | Omer Schwartz <oschwart> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 17.0 (Wallaby) | CC: | gthiemon, jamsmith, mburns, mdemaced, oschwart |
| Target Milestone: | ga | Keywords: | Triaged |
| Target Release: | 17.1 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | python-ovn-octavia-provider-1.0.3-1.20230223161047.82a4691.el9ost | Doc Type: | Bug Fix |
| Doc Text: |
Before this update, instances lost communication with the ovn-metadata-port because the load balancer health monitor replied to the ARP requests for the OVN metadata agent's IP, so requests intended for the metadata agent were sent to another MAC address. With this update, the ovn-controller conducts back-end checks by using a dedicated port instead of the ovn-metadata-port. This port is distinct for each subnet, and multiple health monitors in the same subnet can reuse it. When establishing a health monitor for a load balancer pool, ensure that there is an available IP in the subnet of the load balancer VIP. Health monitor checks no longer impact ovn-metadata-port communications for instances.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-08-16 01:13:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Fernando Royo
2023-02-13 11:39:32 UTC
Using the puddle RHOS-17.1-RHEL-9-20230426.n.1, I ran the following commands:
(overcloud) [stack@undercloud-0 ~]$ openstack port list | grep hm
| d7e6c687-6118-4d36-b764-ef8407a61dbb | ovn-lb-hm-576dfdfb-e8ea-4188-9b81-79b96472a3fb | fa:16:3e:aa:20:71 | ip_address='10.0.64.3', subnet_id='576dfdfb-e8ea-4188-9b81-79b96472a3fb' | DOWN |
We can see that the ovn-lb-hm port exists and uses ip_address='10.0.64.3', which should be the source IP that the health monitor uses for each member.
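The check on the port row above can also be scripted. The following is a minimal sketch, not part of the provider: it parses the row exactly as printed above and confirms that the port name ends with the subnet ID it is attached to. The `ovn-lb-hm-` prefix is taken from this one row, and the helper names `parse_port_row` and `is_hm_port_for_subnet` are hypothetical.

```python
import re

# Row copied from the `openstack port list | grep hm` output above.
ROW = (
    "| d7e6c687-6118-4d36-b764-ef8407a61dbb "
    "| ovn-lb-hm-576dfdfb-e8ea-4188-9b81-79b96472a3fb "
    "| fa:16:3e:aa:20:71 "
    "| ip_address='10.0.64.3', subnet_id='576dfdfb-e8ea-4188-9b81-79b96472a3fb' "
    "| DOWN |"
)

# Prefix as observed in this report; an assumption, not read from the provider code.
HM_PREFIX = "ovn-lb-hm-"


def parse_port_row(row):
    """Split one table row into a dict (hypothetical helper)."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    port_id, name, mac, fixed_ips, status = cells
    return {
        "id": port_id,
        "name": name,
        "mac": mac,
        "ip": re.search(r"ip_address='([^']+)'", fixed_ips).group(1),
        "subnet_id": re.search(r"subnet_id='([^']+)'", fixed_ips).group(1),
        "status": status,
    }


def is_hm_port_for_subnet(port):
    """True if the port name is the observed prefix followed by its subnet ID."""
    return port["name"] == HM_PREFIX + port["subnet_id"]


hm_port = parse_port_row(ROW)
print(hm_port["ip"], is_hm_port_for_subnet(hm_port))
```

The parsed `ip` value is the address to compare against `src_ip` in the Service_Monitor rows later in this comment.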
Some details about the LB members:
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer status show lb_ovn
...
"members": [
{
"id": "7cd7ebe8-f73c-4a2a-a22f-2b44bd4b8c06",
"name": "tcp_member1",
"operating_status": "ONLINE",
"provisioning_status": "ACTIVE",
"address": "10.0.64.47",
"protocol_port": 8080
},
{
"id": "73e2dd4c-de26-4a87-8b3e-892d0c6f9b09",
"name": "tcp_member2",
"operating_status": "ONLINE",
"provisioning_status": "ACTIVE",
"address": "10.0.64.56",
"protocol_port": 8080
}
]
(overcloud) [stack@undercloud-0 ~]$ ssh tripleo-admin
Warning: Permanently added 'controller-0.ctlplane' (ED25519) to the list of known hosts.
Register this system with Red Hat Insights: insights-client --register
Create an account or view all your systems at https://red.ht/insights-dashboard
Last login: Fri May 5 08:32:35 2023 from 192.168.24.1
[tripleo-admin@controller-0 ~]$ sudo bash
[root@controller-0 tripleo-admin]# podman exec -it -uroot ovn_controller ovn-sbctl list Service_Monitor
_uuid : edfacd21-5a89-40ab-ab01-ac8adb0fc39a
external_ids : {}
ip : "10.0.64.47"
logical_port : "f44a701b-9376-4a89-b544-57eca790b79c"
options : {failure_count="3", interval="10", success_count="4", timeout="5"}
port : 8080
protocol : tcp
src_ip : "10.0.64.3"
src_mac : "16:ed:b6:15:9c:6a"
status : online

_uuid : 305c3adf-a42e-4852-844c-8032aca7a8e1
external_ids : {}
ip : "10.0.64.56"
logical_port : "ef1b7570-4de5-41fc-88b2-6c4f5b033269"
options : {failure_count="3", interval="10", success_count="4", timeout="5"}
port : 8080
protocol : tcp
src_ip : "10.0.64.3"
src_mac : "16:ed:b6:15:9c:6a"
status : online
We can see that both Service_Monitor rows use 10.0.64.3 as the source IP (src_ip), and that their ip values match the member addresses returned by the loadbalancer status show command.
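The same comparison can be automated. The sketch below is only an illustration under stated assumptions: it parses text in the `key : value` layout printed by `ovn-sbctl list Service_Monitor` above (trimmed here to the relevant columns), then checks that every row uses the health monitor port IP as `src_ip` and that the monitored addresses equal the member addresses. The helper names `parse_records` and `monitors_use_hm_port` are hypothetical.

```python
# Output copied from the Service_Monitor listing above, trimmed to the relevant columns.
SAMPLE = """\
_uuid : edfacd21-5a89-40ab-ab01-ac8adb0fc39a
ip : "10.0.64.47"
port : 8080
src_ip : "10.0.64.3"
src_mac : "16:ed:b6:15:9c:6a"
status : online
_uuid : 305c3adf-a42e-4852-844c-8032aca7a8e1
ip : "10.0.64.56"
port : 8080
src_ip : "10.0.64.3"
src_mac : "16:ed:b6:15:9c:6a"
status : online
"""


def parse_records(text):
    """Group `key : value` lines into one dict per row; a new row starts at `_uuid`."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator lines between rows are ignored
        key, _, value = line.partition(":")  # split at the first colon only
        key = key.strip()
        value = value.strip().strip('"')
        if key == "_uuid":
            records.append({})
        if records:
            records[-1][key] = value
    return records


def monitors_use_hm_port(records, hm_ip, member_ips):
    """True if every row probes from hm_ip and the probed IPs equal the member IPs."""
    same_source = all(rec.get("src_ip") == hm_ip for rec in records)
    same_members = {rec.get("ip") for rec in records} == set(member_ips)
    return same_source and same_members


records = parse_records(SAMPLE)
print(monitors_use_hm_port(records, "10.0.64.3", ["10.0.64.47", "10.0.64.56"]))
```

Passing the metadata port address instead of the `ovn-lb-hm` port address as `hm_ip` makes the check fail, which is the condition this bug fix is meant to rule out.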
With this fix, communication through the ovn-metadata-port also works:
(overcloud) [stack@undercloud-0 ~]$ ssh tripleo-admin
Warning: Permanently added 'compute-0.ctlplane' (ED25519) to the list of known hosts.
Register this system with Red Hat Insights: insights-client --register
Create an account or view all your systems at https://red.ht/insights-dashboard
Last login: Fri May 5 08:46:40 2023 from 192.168.24.1
[tripleo-admin@compute-0 ~]$ ip net
ovnmeta-89249e30-ff7c-4748-8279-39c5b8c21a09 (id: 0)
[tripleo-admin@compute-0 ~]$ sudo ip net e ovnmeta-89249e30-ff7c-4748-8279-39c5b8c21a09 ssh cirros@10.0.64.47
cirros@10.0.64.47's password:
$ date
Fri May 5 09:52:39 UTC 2023
The SSH connection to the member succeeded through the ovn-metadata-port.
The BZ looks good to me and I am moving its status to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2023:4577