Bug 1671347
| Summary: | ovs-vswitchd crash in a deployment with ovn - OSP13 | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Jon Uriarte <juriarte> |
| Component: | openvswitch | Assignee: | Numan Siddique <nusiddiq> |
| Status: | CLOSED ERRATA | QA Contact: | Jon Uriarte <juriarte> |
| Severity: | high | Priority: | high |
| Version: | 13.0 (Queens) | Target Release: | 13.0 (Queens) |
| Target Milestone: | zstream | Keywords: | Triaged, ZStream |
| Hardware: | Unspecified | OS: | Unspecified |
| Fixed In Version: | openvswitch-2.9.0-95.el7fdp | Type: | Bug |
| Last Closed: | 2019-03-14 13:28:34 UTC | Bug Blocks: | 1670794 |
| CC: | apevec, ccopello, chrisw, dalvarez, juriarte, lmartins, nusiddiq, rhos-maint, rsafrono | | |
ovs-vswitchd logs during the crash (the complete ovs-vswitchd.log file is attached):

2019-01-31T11:27:40.741Z|00001|ofproto_dpif_xlate(handler34)|WARN|Invalid Geneve tunnel metadata while processing udp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:60:c5:f0,dl_dst=fa:16:3e:41:cd:ed,nw_src=192.168.99.4,nw_dst=10.46.22.50,nw_tos=0,nw_ecn=0,nw_ttl=63,tp_src=42305,tp_dst=53 on bridge br-int
2019-01-31T11:27:41.348Z|00001|ofproto_dpif_xlate(handler40)|WARN|Invalid Geneve tunnel metadata while processing tcp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:be:76:44,dl_dst=52:54:00:08:3e:d7,nw_src=172.16.40.10,nw_dst=10.46.22.24,nw_tos=0,nw_ecn=0,nw_ttl=63,tp_src=57364,tp_dst=5000,tcp_flags=ack on bridge br-int
2019-01-31T11:27:42.826Z|00001|ofproto_dpif_xlate(handler38)|WARN|Invalid Geneve tunnel metadata while processing tcp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:18:ea:9c,dl_dst=fa:16:3e:f0:7d:3f,nw_src=172.24.0.14,nw_dst=172.24.0.5,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=9443,tp_dst=37734,tcp_flags=psh|ack on bridge br-int
2019-01-31T11:27:42.826Z|00001|ofproto_dpif_xlate(handler41)|WARN|Invalid Geneve tunnel metadata while processing tcp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:18:ea:9c,dl_dst=fa:16:3e:f0:7d:3f,nw_src=172.24.0.14,nw_dst=172.24.0.5,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=9443,tp_dst=37734,tcp_flags=psh|ack on bridge br-int

Created attachment 1525373 [details]
ovs-vswitchd logs
It seems this was fixed [1] on the ovs master and 2.10 branches, but not on 2.9.

[1] https://patchwork.ozlabs.org/patch/979576/

Same behaviour in OSP 14, with almost the same core dump (find it attached):
#0 tun_metadata_to_geneve__ (flow=flow@entry=0x7ffc8183c5d8, b=b@entry=0x7ffc81820288, crit_opt=crit_opt@entry=0x7ffc8181ff67) at ../lib/tun-metadata.c:676
#1 0x00005624864a8f6b in tun_metadata_to_geneve_nlattr_flow (b=0x7ffc81820288, flow=0x7ffc8183c590) at ../lib/tun-metadata.c:706
#2 tun_metadata_to_geneve_nlattr (tun=tun@entry=0x7ffc8183c590, flow=flow@entry=0x7ffc8183c590, key=key@entry=0x0, b=b@entry=0x7ffc81820288) at ../lib/tun-metadata.c:810
#3 0x000056248642a4cc in tun_key_to_attr (a=a@entry=0x7ffc81820288, tun_key=tun_key@entry=0x7ffc8183c590, tun_flow_key=tun_flow_key@entry=0x7ffc8183c590, key_buf=key_buf@entry=0x0, tnl_type=<optimized out>,
tnl_type@entry=0x0) at ../lib/odp-util.c:2885
#4 0x0000562486435ce2 in odp_key_from_dp_packet (buf=buf@entry=0x7ffc81820288, packet=0x7ffc8183c480) at ../lib/odp-util.c:5908
#5 0x00005624864b8df0 in dpif_netlink_encode_execute (buf=0x7ffc81820288, d_exec=0x7ffc8183c0e8, dp_ifindex=<optimized out>) at ../lib/dpif-netlink.c:1742
#6 dpif_netlink_operate__ (dpif=dpif@entry=0x56248833ca50, ops=ops@entry=0x7ffc8183c0d8, n_ops=n_ops@entry=1) at ../lib/dpif-netlink.c:1828
#7 0x00005624864b95ce in dpif_netlink_operate_chunks (n_ops=1, ops=0x7ffc8183c0d8, dpif=<optimized out>) at ../lib/dpif-netlink.c:2120
#8 dpif_netlink_operate (dpif_=0x56248833ca50, ops=0x7ffc8183c0d8, n_ops=<optimized out>) at ../lib/dpif-netlink.c:2156
#9 0x00005624863fac53 in dpif_operate (dpif=0x56248833ca50, ops=ops@entry=0x7ffc8183c0d8, n_ops=n_ops@entry=1) at ../lib/dpif.c:1349
#10 0x00005624863fb438 in dpif_execute (dpif=<optimized out>, execute=execute@entry=0x7ffc8183c170) at ../lib/dpif.c:1314
#11 0x00005624863a8951 in nxt_resume (ofproto_=0x56248835fea0, pin=0x7ffc8183cbe0) at ../ofproto/ofproto-dpif.c:5085
#12 0x00005624863946f6 in handle_nxt_resume (ofconn=ofconn@entry=0x5624882fe690, oh=oh@entry=0x562488326ce0) at ../ofproto/ofproto.c:3619
#13 0x00005624863a08cb in handle_openflow__ (msg=0x5624883262c0, ofconn=0x5624882fe690) at ../ofproto/ofproto.c:8167
#14 handle_openflow (ofconn=0x5624882fe690, ofp_msg=0x5624883262c0) at ../ofproto/ofproto.c:8288
#15 0x00005624863d2583 in ofconn_run (handle_openflow=0x5624863a0600 <handle_openflow>, ofconn=0x5624882fe690) at ../ofproto/connmgr.c:1446
#16 connmgr_run (mgr=0x562488329bd0, handle_openflow=handle_openflow@entry=0x5624863a0600 <handle_openflow>) at ../ofproto/connmgr.c:365
#17 0x000056248639a4ce in ofproto_run (p=0x56248835fea0) at ../ofproto/ofproto.c:1825
#18 0x0000562486387bac in bridge_run__ () at ../vswitchd/bridge.c:2944
#19 0x000056248638da08 in bridge_run () at ../vswitchd/bridge.c:3002
#20 0x00005624861cce15 in main (argc=12, argv=0x7ffc8183e088) at ../vswitchd/ovs-vswitchd.c:125
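
For reference, a backtrace like the one above can be regenerated from the attached core dump; a minimal sketch, assuming the matching debuginfo package is installed (it already is on this controller, per the rpm -qa output below) and using an illustrative core file name that follows the core_pattern given in the description at the end of this report:

# Debug symbols must match the running build exactly:
sudo yum install openvswitch2.10-debuginfo
# Open the core against the binary that produced it (the core file name is illustrative):
gdb /usr/sbin/ovs-vswitchd /tmp/core-ovs-vswitchd-sig11-user995-group992-pid12345-time1548930000
# At the (gdb) prompt, print the crashing thread's stack:
# (gdb) bt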
Version-Release number of selected component (if applicable):
OSP 14 2019-01-17.2 puddle
Controller (RHEL 7.6):
[root@controller-0 tmp]# rpm -qa | grep openvswitch
openvswitch2.10-2.10.0-28.el7fdp.1.x86_64
rhosp-openvswitch-2.10-0.1.el7ost.noarch
openvswitch2.10-debuginfo-2.10.0-28.el7fdp.1.x86_64
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
[heat-admin@controller-0 ~]$ sudo docker images | grep ovn
192.168.24.1:8787/rhosp14/openstack-ovn-northd 2019-01-16.1 03bdcefcab6c 8 days ago 752 MB
192.168.24.1:8787/rhosp14/openstack-ovn-northd pcmklatest 03bdcefcab6c 8 days ago 752 MB
192.168.24.1:8787/rhosp14/openstack-neutron-server-ovn 2019-01-16.1 29207e903690 8 days ago 920 MB
192.168.24.1:8787/rhosp14/openstack-nova-novncproxy 2019-01-16.1 9803d4995c22 2 weeks ago 880 MB
192.168.24.1:8787/rhosp14/openstack-ovn-controller 2019-01-16.1 7cde11f5e83e 2 weeks ago 626 MB
neutron_api container:
python-neutron-fwaas-13.0.1-0.20180910012802.3c6cfbc.el7ost.noarch
openstack-neutron-13.0.2-0.20180929022427.c7f970c.el7ost.noarch
openstack-neutron-common-13.0.2-0.20180929022427.c7f970c.el7ost.noarch
openstack-neutron-ml2-13.0.2-0.20180929022427.c7f970c.el7ost.noarch
puppet-neutron-13.3.1-0.20181013115834.el7ost.noarch
python-neutron-13.0.2-0.20180929022427.c7f970c.el7ost.noarch
openstack-neutron-lbaas-13.0.1-0.20180831185310.e0cca6e.el7ost.noarch
python2-neutronclient-6.9.1-0.20180925041810.7eba94e.el7ost.noarch
python-neutron-lbaas-13.0.1-0.20180831185310.e0cca6e.el7ost.noarch
openstack-neutron-fwaas-13.0.1-0.20180910012802.3c6cfbc.el7ost.noarch
python2-neutron-lib-1.18.0-0.20180816094046.67865c7.el7ost.noarch
ovn_controller container:
puppet-ovn-13.3.1-0.20181013120724.38e2e33.el7ost.noarch
openvswitch2.10-ovn-common-2.10.0-28.el7fdp.1.x86_64
rhosp-openvswitch-ovn-host-2.10-0.1.el7ost.noarch
rhosp-openvswitch-ovn-common-2.10-0.1.el7ost.noarch
openvswitch2.10-ovn-host-2.10.0-28.el7fdp.1.x86_64
Created attachment 1525778 [details]
ovs-vswitchd core dump
Looking into the backtrace, the patch [1] should fix the issue.

[1] - https://github.com/openvswitch/ovs/commit/bed941ba0f14854683c241fa9bff3d49dd2efeee

(In reply to Numan Siddique from comment #6)
> Looking into the backtrace, the patch [1] should fix the issue
>
> [1] -
> https://github.com/openvswitch/ovs/commit/
> bed941ba0f14854683c241fa9bff3d49dd2efeee

Comment #3 already says that :)

@juriarte from the shiftstack team confirmed that the crash doesn't happen with this version. Still working on validation.

Please Jon, update this after your findings.

Waiting for [1] to be fixed, as it blocks the Openshift installation.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1684077

Verified in OSP13 2019-02-25.2 puddle.
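
Before the full verification evidence below, a quick sanity check that the deployed build is at least the fixed version; a minimal sketch, assuming the controller layout shown in this report (whether the package changelog names this bug or the Geneve fix explicitly is an assumption):

[heat-admin@controller-0 ~]$ rpm -q openvswitch
openvswitch-2.9.0-95.el7fdp.x86_64
[heat-admin@controller-0 ~]$ rpm -q --changelog openvswitch | head -5    # recent entries should cover the fixed build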
[stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed
13 -p 2019-02-25.2
[heat-admin@controller-0 ~]$ rpm -qa | grep openvswitch
openvswitch-2.9.0-95.el7fdp.x86_64
openvswitch-ovn-host-2.9.0-95.el7fdp.x86_64
openvswitch-ovn-common-2.9.0-95.el7fdp.x86_64
openstack-neutron-openvswitch-12.0.5-4.el7ost.noarch
python-openvswitch-2.9.0-95.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-ovn-central-2.9.0-95.el7fdp.x86_64
[heat-admin@controller-0 ~]$ sudo grep -r service_plugins /var/lib/config-data/puppet-generated/neutron/etc/
/var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf:#service_plugins =
/var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf:service_plugins=qos,ovn-router,trunk
[heat-admin@controller-0 ~]$ sudo docker images | grep ovn
192.168.24.1:8787/rhosp13/openstack-neutron-server-ovn 2019-02-24.1 9626dfbcacff 4 days ago 856 MB
192.168.24.1:8787/rhosp13/openstack-nova-novncproxy 2019-02-24.1 8e3ba946872f 4 days ago 808 MB
192.168.24.1:8787/rhosp13/openstack-ovn-controller 2019-02-24.1 5ff907fbf00f 4 days ago 590 MB
192.168.24.1:8787/rhosp13/openstack-ovn-northd 2019-02-24.1 2efa8840e5bf 4 days ago 716 MB
192.168.24.1:8787/rhosp13/openstack-ovn-northd pcmklatest 2efa8840e5bf 4 days ago 716 MB
- neutron_api container:
[heat-admin@controller-0 ~]$ sudo docker exec -it neutron_api rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-2.9.0-95.el7fdp.x86_64
python-openvswitch-2.9.0-95.el7fdp.x86_64
- ovn_controller container:
[heat-admin@controller-0 ~]$ sudo docker exec -it ovn_controller rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-2.9.0-95.el7fdp.x86_64
python-openvswitch-2.9.0-95.el7fdp.x86_64
openvswitch-ovn-common-2.9.0-95.el7fdp.x86_64
openvswitch-ovn-host-2.9.0-95.el7fdp.x86_64
Verification steps:
1. Deploy OSP 13 with Octavia and the OVN backend in a hybrid environment.
· Set in the Jenkins job:
ir_tripleo_overcloud_templates: 'ovn-extras,octavia'
ir_tripleo_overcloud_network_backend:
- 'geneve'
ir_tripleo_overcloud_deploy_override_options: |-
--config-heat ControllerExtraConfig.neutron::agents::l3::extensions="fip_qos" \
--network-ovn yes \
--extra-deploy-params='-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml'
The OSP installation must set "service_plugins=qos,ovn-router,trunk" in neutron.conf.
2. Deploy OCP 3.11 with Kuryr:
Note: A workaround for [1] has been applied: "yum install openshift-ansible --enablerepo=rhelosp-rhel-7.6-server-opt"
ansible-playbook --user openshift -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" -i inventory "/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/provision.yml"
3. Check that the provision playbook finishes successfully and that the stack openshift.example.com is created in the overcloud.
(shiftstack) [cloud-user@ansible-host-0 ~]$ openstack stack list
+--------------------------------------+-----------------------+-----------------+----------------------+--------------+
| ID | Stack Name | Stack Status | Creation Time | Updated Time |
+--------------------------------------+-----------------------+-----------------+----------------------+--------------+
| 2fa65297-882c-4a2f-8b5f-dfe8aeca1711 | openshift.example.com | CREATE_COMPLETE | 2019-03-01T13:39:49Z | None |
+--------------------------------------+-----------------------+-----------------+----------------------+--------------+
(shiftstack) [cloud-user@ansible-host-0 ~]$ openstack server list
+--------------------------------------+------------------------------------+--------+------------------------------------------------------------------------+---------------------------------------+-----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+------------------------------------+--------+------------------------------------------------------------------------+---------------------------------------+-----------+
| 080e3c8e-b644-489e-bc0c-97d1fc13b8b9 | infra-node-0.openshift.example.com | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.9, 10.46.22.113 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.node |
| eb91ed29-a48b-49f8-8ce6-b32f4f8e5531 | master-0.openshift.example.com | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.8, 10.46.22.103 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.master |
| 8092cf88-3623-47bc-98e5-2b01df2c3e8b | app-node-0.openshift.example.com | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.23, 10.46.22.96 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.node |
| 96db78db-1c70-4d47-886b-d0f3461a5f22 | app-node-1.openshift.example.com | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.4, 10.46.22.91 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.node |
| c89b10ce-148d-47ba-889e-5ba3c6228737 | ansible_host-0 | ACTIVE | private_openshift=172.16.40.18, 10.46.22.105 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.small |
| c5b6b356-42c0-456f-8be1-28e1090aedd2 | openshift_dns-0 | ACTIVE | openshift_dns=192.168.23.13, 10.46.22.95 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.small |
+--------------------------------------+------------------------------------+--------+------------------------------------------------------------------------+---------------------------------------+-----------+
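
The stack status check from step 3 can also be done non-interactively; a minimal sketch, assuming the stack name shown above:

(shiftstack) [cloud-user@ansible-host-0 ~]$ openstack stack show openshift.example.com -c stack_status -f value
CREATE_COMPLETE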
4. Check that the ovs-vswitchd service has not been restarted on the controller:
[heat-admin@controller-0 ~]$ sudo grep vswitchd /var/log/messages
Mar 1 06:27:18 controller-0 ovs-ctl: Starting ovs-vswitchd [ OK ]
[heat-admin@controller-0 ~]$ sudo grep -r "ovs-ctl: " /var/log/messages
Mar 1 06:27:18 controller-0 ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 1 06:27:18 controller-0 ovs-ctl: Creating empty database /etc/openvswitch/conf.db [ OK ]
Mar 1 06:27:18 controller-0 ovs-ctl: Starting ovsdb-server [ OK ]
Mar 1 06:27:18 controller-0 ovs-ctl: Configuring Open vSwitch system IDs [ OK ]
Mar 1 06:27:18 controller-0 ovs-ctl: Inserting openvswitch module [ OK ]
Mar 1 06:27:18 controller-0 ovs-ctl: Starting ovs-vswitchd [ OK ]
Mar 1 06:27:18 controller-0 ovs-ctl: Enabling remote OVSDB managers [ OK ]
[heat-admin@controller-0 ~]$ sudo systemctl status ovs-vswitchd
● ovs-vswitchd.service - Open vSwitch Forwarding Unit
Loaded: loaded (/usr/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: disabled)
Active: active (running) since vie 2019-03-01 11:27:18 UTC; 3h 50min ago
Main PID: 4672 (ovs-vswitchd)
Tasks: 18
Memory: 147.8M
CGroup: /system.slice/ovs-vswitchd.service
└─4672 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --p...
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
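
A complementary check that no crash-restart loop occurred, using the same log locations as above; a minimal sketch (exactly one start is expected when the daemon never crashed):

[heat-admin@controller-0 ~]$ sudo grep -c "Starting ovs-vswitchd" /var/log/messages
1
[heat-admin@controller-0 ~]$ sudo grep -i "segfault" /var/log/messages | grep ovs-vswitchd    # no output expected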
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1684077
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0552
Description of problem:

I'm trying to install OCP (Openshift) on top of OSP 13 with OVN. When deploying Openshift as a stack in the overcloud, the ovs-vswitchd process starts crashing on the controller node, and systemd restarts it automatically. After a few crash-restart loops systemd gives up and the ovs service remains stopped, so connectivity with the controller is lost. (ovs-vswitchd can be restarted from the controller console, but it will remain in the crash-restart loop until systemd no longer starts it.)

Backtrace:

#0 tun_metadata_to_geneve__ (flow=flow@entry=0x7fffd50735d0, b=b@entry=0x7fffd5057298, crit_opt=crit_opt@entry=0x7fffd5056fa7) at ../lib/tun-metadata.c:676
#1 0x000055f5ef9268bb in tun_metadata_to_geneve_nlattr_flow (b=0x7fffd5057298, flow=0x7fffd5073590) at ../lib/tun-metadata.c:706
#2 tun_metadata_to_geneve_nlattr (tun=tun@entry=0x7fffd5073590, flow=flow@entry=0x7fffd5073590, key=key@entry=0x0, b=b@entry=0x7fffd5057298) at ../lib/tun-metadata.c:810
#3 0x000055f5ef8ac031 in tun_key_to_attr (a=a@entry=0x7fffd5057298, tun_key=tun_key@entry=0x7fffd5073590, tun_flow_key=tun_flow_key@entry=0x7fffd5073590, key_buf=key_buf@entry=0x0) at ../lib/odp-util.c:2778
#4 0x000055f5ef8b73ef in odp_key_from_dp_packet (buf=buf@entry=0x7fffd5057298, packet=0x7fffd5073480) at ../lib/odp-util.c:5633
#5 0x000055f5ef936080 in dpif_netlink_encode_execute (buf=0x7fffd5057298, d_exec=0x7fffd50730e8, dp_ifindex=<optimized out>) at ../lib/dpif-netlink.c:1718
#6 dpif_netlink_operate__ (dpif=dpif@entry=0x55f5f1406c90, ops=ops@entry=0x7fffd50730d8, n_ops=n_ops@entry=1) at ../lib/dpif-netlink.c:1804
#7 0x000055f5ef9366d6 in dpif_netlink_operate_chunks (n_ops=1, ops=0x7fffd50730d8, dpif=<optimized out>) at ../lib/dpif-netlink.c:2103
#8 dpif_netlink_operate (dpif_=0x55f5f1406c90, ops=0x7fffd50730d8, n_ops=<optimized out>) at ../lib/dpif-netlink.c:2139
#9 0x000055f5ef87e163 in dpif_operate (dpif=0x55f5f1406c90, ops=ops@entry=0x7fffd50730d8, n_ops=n_ops@entry=1) at ../lib/dpif.c:1350
#10 0x000055f5ef87e948 in dpif_execute (dpif=<optimized out>, execute=execute@entry=0x7fffd5073170) at ../lib/dpif.c:1315
#11 0x000055f5ef82ff31 in nxt_resume (ofproto_=0x55f5f14055e0, pin=0x7fffd5073c00) at ../ofproto/ofproto-dpif.c:4879
#12 0x000055f5ef81c416 in handle_nxt_resume (ofconn=ofconn@entry=0x55f5f142fb40, oh=oh@entry=0x55f5f148a2d0) at ../ofproto/ofproto.c:3607
#13 0x000055f5ef82849b in handle_openflow__ (msg=0x55f5f1479a10, ofconn=0x55f5f142fb40) at ../ofproto/ofproto.c:8125
#14 handle_openflow (ofconn=0x55f5f142fb40, ofp_msg=0x55f5f1479a10) at ../ofproto/ofproto.c:8246
#15 0x000055f5ef858b23 in ofconn_run (handle_openflow=0x55f5ef8281d0 <handle_openflow>, ofconn=0x55f5f142fb40) at ../ofproto/connmgr.c:1432
#16 connmgr_run (mgr=0x55f5f1405b30, handle_openflow=handle_openflow@entry=0x55f5ef8281d0 <handle_openflow>) at ../ofproto/connmgr.c:363
#17 0x000055f5ef8221be in ofproto_run (p=0x55f5f14055e0) at ../ofproto/ofproto.c:1816
#18 0x000055f5ef80f6bc in bridge_run__ () at ../vswitchd/bridge.c:2939
#19 0x000055f5ef815738 in bridge_run () at ../vswitchd/bridge.c:2997
#20 0x000055f5ef65a845 in main (argc=12, argv=0x7fffd5075088) at ../vswitchd/ovs-vswitchd.c:121

Note - in order to dump the cores, do the following:

sysctl -w fs.suid_dumpable=1
echo /tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t > /proc/sys/kernel/core_pattern

The cores will be dumped in /tmp/.

Version-Release number of selected component (if applicable):

OSP 13 2019-01-22.1 puddle

Controller (RHEL 7.6):

[root@controller-0 ~]# rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-ovn-host-2.9.0-83.el7fdp.1.x86_64
openvswitch-2.9.0-83.el7fdp.1.x86_64
openvswitch-debuginfo-2.9.0-83.el7fdp.1.x86_64
openstack-neutron-openvswitch-12.0.5-3.el7ost.noarch
openvswitch-ovn-common-2.9.0-83.el7fdp.1.x86_64
openvswitch-ovn-central-2.9.0-83.el7fdp.1.x86_64
python-openvswitch-2.9.0-83.el7fdp.1.x86_64

[root@controller-0 ~]# sudo docker images | grep ovn
192.168.24.1:8787/rhosp13/openstack-neutron-server-ovn 2019-01-21.1 af2f2a07db50 8 days ago 867 MB
192.168.24.1:8787/rhosp13/openstack-ovn-northd 2019-01-21.1 fc759e84aca6 8 days ago 728 MB
192.168.24.1:8787/rhosp13/openstack-ovn-northd pcmklatest fc759e84aca6 8 days ago 728 MB
192.168.24.1:8787/rhosp13/openstack-nova-novncproxy 2019-01-21.1 b97fc5ca7fcc 12 days ago 819 MB
192.168.24.1:8787/rhosp13/openstack-ovn-controller 2019-01-21.1 ff46d4267134 12 days ago 602 MB

neutron_api container:
python-neutron-12.0.5-3.el7ost.noarch
openstack-neutron-lbaas-12.0.1-0.20181019202914.b9b6b6a.el7ost.noarch
python-neutron-lbaas-12.0.1-0.20181019202914.b9b6b6a.el7ost.noarch
openstack-neutron-fwaas-12.0.1-1.el7ost.noarch
python2-neutronclient-6.7.0-1.el7ost.noarch
puppet-neutron-12.4.1-3.ed05e01git.el7ost.noarch
python-neutron-fwaas-12.0.1-1.el7ost.noarch
openstack-neutron-12.0.5-3.el7ost.noarch
openstack-neutron-common-12.0.5-3.el7ost.noarch
openstack-neutron-ml2-12.0.5-3.el7ost.noarch
python2-neutron-lib-1.13.0-1.el7ost.noarch

ovn_controller container:
puppet-ovn-12.4.0-1.el7ost.noarch
openvswitch-ovn-common-2.9.0-83.el7fdp.1.x86_64
openvswitch-ovn-host-2.9.0-83.el7fdp.1.x86_64

How reproducible: always

Steps to Reproduce:
1. Install OSP 13 with Octavia and OVN (I'm using infrared):

openstack overcloud deploy \
  --timeout 100 \
  --templates /usr/share/openstack-tripleo-heat-templates \
  --stack overcloud \
  --libvirt-type kvm \
  --ntp-server clock.redhat.com \
  -e /home/stack/hybrid_templates/config_lvm.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /home/stack/hybrid_templates/network/network-environment.yaml \
  -e /home/stack/hybrid_templates/hostnames.yml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml \
  -e /home/stack/hybrid_templates/debug.yaml \
  -e /home/stack/hybrid_templates/docker-images.yaml \
  -e /home/stack/hybrid_templates/prereq.yaml \
  -e /home/stack/hybrid_templates/config_heat.yaml \
  -e /home/stack/hybrid_templates/nodes_data.yaml \
  --environment-file /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \
  -e /home/stack/hybrid_templates/ovn-extras.yaml \
  --log-file overcloud_deployment_32.log

2. Run the OCP 3.11 installation playbooks:

ansible-playbook --user openshift -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" -i inventory "/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/provision.yml"

3. Check that the provision playbook finishes successfully and that the stack openshift.example.com is created in the overcloud.

Actual results:

The provision playbook fails due to lost connectivity:

TASK [Wait for the the nodes to come up] **************************************************************************************************************************************************************************
ok: [master-0.openshift.example.com]
ok: [app-node-0.openshift.example.com]
ok: [infra-node-0.openshift.example.com]
fatal: [app-node-1.openshift.example.com]: FAILED! => {"changed": false, "elapsed": 7377, "msg": "timed out waiting for ping module test success: Failed to connect to the host via ssh: Warning: Permanently added '10.46.22.38' (ECDSA) to the list of known hosts.\r\nAuthentication failed.\r\n"}

Connectivity with the controller is lost, and openstack commands cannot be executed.

Expected results:

The provision ends successfully and the stack is created.

Additional info:

The ovs-vswitchd service can be restarted by logging in to the controller node through the console. For that, the controller's root password must be changed, as it is requested during log-in via the console.