Bug 2115035
Summary: | After a Controller reboot ovn_metadata_agent goes into unhealthy state | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Julia Marciano <jmarcian> |
Component: | openvswitch | Assignee: | Miro Tomaska <mtomaska> |
Status: | CLOSED ERRATA | QA Contact: | Eran Kuris <ekuris> |
Severity: | high | Docs Contact: | |
Priority: | urgent | ||
Version: | 17.0 (Wallaby) | CC: | apevec, bcafarel, chrisw, dalvarez, ekuris, fleitner, jamsmith, jpretori, jschluet, mkrcmari, mtomaska, scohen, spower |
Target Milestone: | ga | Keywords: | Regression, TestOnly, Triaged |
Target Release: | 17.0 | Flags: | mtomaska: needinfo- |
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | openvswitch2.17-2.17.0-32.1 | Doc Type: | Bug Fix |
Doc Text: |
This update fixes a bug that caused intermittent SSL connection problems between services such as ovn-metadata-agent and the OVN southbound database.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2022-09-21 12:24:42 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2114617 |
Description
Julia Marciano
2022-08-03 18:20:51 UTC
Looks like the root cause of this issue is OVS switching from pyOpenSSL to the Python standard library [1]. The standard library's send for SSL sockets [2] does not allow a non-zero flags argument, whereas the pyOpenSSL send function [3] ignored it.

[1] https://github.com/openvswitch/ovs/commit/68543dd523bd00f53fa7b91777b962ccb22ce679
[2] https://github.com/python/cpython/blob/main/Lib/ssl.py#L1141-L1156
[3] https://github.com/pyca/pyopenssl/blob/38f9b4e524ac6479d57021bba2270df84d85b672/src/OpenSSL/SSL.py#L1844

Patch is posted upstream for review.
https://github.com/ovsrobot/ovs/commit/f09a55946cc83583c2e93be632e50f51ea830322

The trac team deemed this a GA blocker but not a blocker for beta.

Verified:

[stack@undercloud-0 ~]$ cat core_puddle_version
RHOS-17.0-RHEL-9-20220816.n.2
[root@controller-0 ~]# rpm -qa|grep openvsw
openvswitch2.17-2.17.0-32.1.el9fdp.x86_64

After a hard reboot (echo b > /proc/sysrq-trigger) of controller-2, the ovn-metadata-agents are healthy on both compute nodes:

[heat-admin@compute-0 ~]$ sudo -i
[root@compute-0 ~]# podman ps|grep meta
00534cbdb30e undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-neutron-metadata-agent-ovn:17.0_20220816.1 kolla_start 23 hours ago Up 23 hours ago (healthy) ovn_metadata_agent
[root@compute-1 ~]# podman ps|grep metadata
1a553fa027e7 undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-neutron-metadata-agent-ovn:17.0_20220816.1 kolla_start 23 hours ago Up 23 hours ago (healthy) ovn_metadata_agent

The ssh connection to the newly created instance succeeded:

[Tue Aug 23 12:05:49 AM UTC 2022] Trying to ssh to 10.0.0.161
cirros Instance instance_d1f5085f0e is reachable via 10.0.0.161

Verified by automated tests as well:
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/Phase3/view/OSP%2017.0/view/PidOne/job/DFG-pidone-sanity-17.0_director-rhel-virthost-3cont_2comp_1ipa-ipv4-geneve-ansible-sts-sanity-tls-everywhere/75/artifact/.sh/ansible_sts-ha-tests.log

*** Bug 2114617 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543
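
For readers who want to see the behaviour named in the root-cause analysis, the following is a minimal sketch and not the OVS code itself. It assumes a local socket pair in place of the real connection to the OVN southbound database, and the helper name ssl_send_error and the flag value 1 are hypothetical, chosen only for illustration. It shows that the standard library's SSL socket rejects any non-zero flags argument to send() before any data is written.

```python
import socket
import ssl


def ssl_send_error(flags):
    """Call send() with the given flags on an SSL-wrapped socket.

    Returns the ValueError message if the call is rejected, else None.
    Only meant to be called with non-zero flags: no TLS handshake is
    performed here, so a real write would block waiting for the peer.
    """
    # A connected local pair stands in for the real network connection
    # (assumption for illustration only).
    left, right = socket.socketpair()
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # no server_hostname is supplied below
    # Skip the handshake: the flags check happens before any TLS I/O.
    tls = ctx.wrap_socket(left, do_handshake_on_connect=False)
    try:
        tls.send(b"ping", flags)
    except ValueError as exc:
        return str(exc)
    finally:
        tls.close()
        right.close()
    return None


if __name__ == "__main__":
    # Any non-zero value triggers the check; 1 is an arbitrary choice.
    print(ssl_send_error(1))
```

Run as a script, this prints the ValueError message raised by the ssl module. A caller that previously relied on pyOpenSSL ignoring the flag would see this exception instead, which is consistent with the failure described above.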