Bug 1671347 - ovs-vswitchd crash in a deployment with ovn - OSP13
Summary: ovs-vswitchd crash in a deployment with ovn - OSP13
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openvswitch
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: zstream
Target Release: 13.0 (Queens)
Assignee: Numan Siddique
QA Contact: Jon Uriarte
URL:
Whiteboard:
Depends On:
Blocks: 1670794
 
Reported: 2019-01-31 13:13 UTC by Jon Uriarte
Modified: 2019-07-25 06:51 UTC
CC List: 9 users

Fixed In Version: openvswitch-2.9.0-95.el7fdn
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-14 13:28:34 UTC
Target Upstream Version:
Embargoed:


Attachments
ovs-vswitchd logs (28.30 KB, text/plain) - 2019-01-31 14:15 UTC, Jon Uriarte
ovs-vswitchd core dump (912.80 KB, application/gzip) - 2019-02-01 10:12 UTC, Jon Uriarte


Links:
Red Hat Product Errata RHBA-2019:0552 (Last Updated: 2019-03-14 13:28:41 UTC)

Description Jon Uriarte 2019-01-31 13:13:03 UTC
Description of problem:

I'm trying to install OCP (OpenShift) on top of OSP 13 with OVN.
When deploying OpenShift as a stack in the overcloud, the ovs-vswitchd process starts crashing on the controller node and systemd restarts it automatically.
After a few crash-restart loops systemd gives up, the OVS service remains stopped, and connectivity with the controller is lost.
(ovs-vswitchd can be restarted from the controller console, but it stays in the crash-restart loop until systemd stops starting it again.)

Backtrace:
#0  tun_metadata_to_geneve__ (flow=flow@entry=0x7fffd50735d0, b=b@entry=0x7fffd5057298, crit_opt=crit_opt@entry=0x7fffd5056fa7) at ../lib/tun-metadata.c:676
#1  0x000055f5ef9268bb in tun_metadata_to_geneve_nlattr_flow (b=0x7fffd5057298, flow=0x7fffd5073590) at ../lib/tun-metadata.c:706
#2  tun_metadata_to_geneve_nlattr (tun=tun@entry=0x7fffd5073590, flow=flow@entry=0x7fffd5073590, key=key@entry=0x0, b=b@entry=0x7fffd5057298) at ../lib/tun-metadata.c:810
#3  0x000055f5ef8ac031 in tun_key_to_attr (a=a@entry=0x7fffd5057298, tun_key=tun_key@entry=0x7fffd5073590, tun_flow_key=tun_flow_key@entry=0x7fffd5073590, key_buf=key_buf@entry=0x0) at ../lib/odp-util.c:2778
#4  0x000055f5ef8b73ef in odp_key_from_dp_packet (buf=buf@entry=0x7fffd5057298, packet=0x7fffd5073480) at ../lib/odp-util.c:5633
#5  0x000055f5ef936080 in dpif_netlink_encode_execute (buf=0x7fffd5057298, d_exec=0x7fffd50730e8, dp_ifindex=<optimized out>) at ../lib/dpif-netlink.c:1718
#6  dpif_netlink_operate__ (dpif=dpif@entry=0x55f5f1406c90, ops=ops@entry=0x7fffd50730d8, n_ops=n_ops@entry=1) at ../lib/dpif-netlink.c:1804
#7  0x000055f5ef9366d6 in dpif_netlink_operate_chunks (n_ops=1, ops=0x7fffd50730d8, dpif=<optimized out>) at ../lib/dpif-netlink.c:2103
#8  dpif_netlink_operate (dpif_=0x55f5f1406c90, ops=0x7fffd50730d8, n_ops=<optimized out>) at ../lib/dpif-netlink.c:2139
#9  0x000055f5ef87e163 in dpif_operate (dpif=0x55f5f1406c90, ops=ops@entry=0x7fffd50730d8, n_ops=n_ops@entry=1) at ../lib/dpif.c:1350
#10 0x000055f5ef87e948 in dpif_execute (dpif=<optimized out>, execute=execute@entry=0x7fffd5073170) at ../lib/dpif.c:1315
#11 0x000055f5ef82ff31 in nxt_resume (ofproto_=0x55f5f14055e0, pin=0x7fffd5073c00) at ../ofproto/ofproto-dpif.c:4879
#12 0x000055f5ef81c416 in handle_nxt_resume (ofconn=ofconn@entry=0x55f5f142fb40, oh=oh@entry=0x55f5f148a2d0) at ../ofproto/ofproto.c:3607
#13 0x000055f5ef82849b in handle_openflow__ (msg=0x55f5f1479a10, ofconn=0x55f5f142fb40) at ../ofproto/ofproto.c:8125
#14 handle_openflow (ofconn=0x55f5f142fb40, ofp_msg=0x55f5f1479a10) at ../ofproto/ofproto.c:8246
#15 0x000055f5ef858b23 in ofconn_run (handle_openflow=0x55f5ef8281d0 <handle_openflow>, ofconn=0x55f5f142fb40) at ../ofproto/connmgr.c:1432
#16 connmgr_run (mgr=0x55f5f1405b30, handle_openflow=handle_openflow@entry=0x55f5ef8281d0 <handle_openflow>) at ../ofproto/connmgr.c:363
#17 0x000055f5ef8221be in ofproto_run (p=0x55f5f14055e0) at ../ofproto/ofproto.c:1816
#18 0x000055f5ef80f6bc in bridge_run__ () at ../vswitchd/bridge.c:2939
#19 0x000055f5ef815738 in bridge_run () at ../vswitchd/bridge.c:2997
#20 0x000055f5ef65a845 in main (argc=12, argv=0x7fffd5075088) at ../vswitchd/ovs-vswitchd.c:121

Note: in order to dump the cores, do the following:
sysctl -w fs.suid_dumpable=1
echo /tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t > /proc/sys/kernel/core_pattern

The cores will be dumped in /tmp/
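
For completeness, a minimal sketch of pulling the backtrace out of one of those cores (assuming the matching openvswitch-debuginfo package is installed; the core file name below is only a placeholder following the pattern configured above):

# install debug symbols so gdb can resolve the OVS frames
debuginfo-install -y openvswitch
# dump the backtrace of all threads from the core (placeholder file name)
gdb /usr/sbin/ovs-vswitchd /tmp/core-ovs-vswitchd-sig11-user0-group0-pid<PID>-time<TIME> -batch -ex "thread apply all bt"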


Version-Release number of selected component (if applicable):
OSP 13 2019-01-22.1 puddle

Controller (RHEL 7.6):
[root@controller-0 ~]# rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-ovn-host-2.9.0-83.el7fdp.1.x86_64
openvswitch-2.9.0-83.el7fdp.1.x86_64
openvswitch-debuginfo-2.9.0-83.el7fdp.1.x86_64
openstack-neutron-openvswitch-12.0.5-3.el7ost.noarch
openvswitch-ovn-common-2.9.0-83.el7fdp.1.x86_64
openvswitch-ovn-central-2.9.0-83.el7fdp.1.x86_64
python-openvswitch-2.9.0-83.el7fdp.1.x86_64

[root@controller-0 ~]# sudo docker images | grep ovn
192.168.24.1:8787/rhosp13/openstack-neutron-server-ovn        2019-01-21.1        af2f2a07db50        8 days ago          867 MB
192.168.24.1:8787/rhosp13/openstack-ovn-northd                2019-01-21.1        fc759e84aca6        8 days ago          728 MB
192.168.24.1:8787/rhosp13/openstack-ovn-northd                pcmklatest          fc759e84aca6        8 days ago          728 MB
192.168.24.1:8787/rhosp13/openstack-nova-novncproxy           2019-01-21.1        b97fc5ca7fcc        12 days ago         819 MB
192.168.24.1:8787/rhosp13/openstack-ovn-controller            2019-01-21.1        ff46d4267134        12 days ago         602 MB

neutron_api container:
python-neutron-12.0.5-3.el7ost.noarch
openstack-neutron-lbaas-12.0.1-0.20181019202914.b9b6b6a.el7ost.noarch
python-neutron-lbaas-12.0.1-0.20181019202914.b9b6b6a.el7ost.noarch
openstack-neutron-fwaas-12.0.1-1.el7ost.noarch
python2-neutronclient-6.7.0-1.el7ost.noarch
puppet-neutron-12.4.1-3.ed05e01git.el7ost.noarch
python-neutron-fwaas-12.0.1-1.el7ost.noarch
openstack-neutron-12.0.5-3.el7ost.noarch
openstack-neutron-common-12.0.5-3.el7ost.noarch
openstack-neutron-ml2-12.0.5-3.el7ost.noarch
python2-neutron-lib-1.13.0-1.el7ost.noarch

ovn_controller container:
puppet-ovn-12.4.0-1.el7ost.noarch
openvswitch-ovn-common-2.9.0-83.el7fdp.1.x86_64
openvswitch-ovn-host-2.9.0-83.el7fdp.1.x86_64


How reproducible: always


Steps to Reproduce:
1. Install OSP 13 with Octavia and OVN (I'm using infrared)
(openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-e /home/stack/hybrid_templates/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/hybrid_templates/network/network-environment.yaml \
-e /home/stack/hybrid_templates/hostnames.yml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml \
-e /home/stack/hybrid_templates/debug.yaml \
-e /home/stack/hybrid_templates/docker-images.yaml \
-e /home/stack/hybrid_templates/prereq.yaml \
-e /home/stack/hybrid_templates/config_heat.yaml \
-e /home/stack/hybrid_templates/nodes_data.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \
-e /home/stack/hybrid_templates/ovn-extras.yaml \
--log-file overcloud_deployment_32.log)

2. Run OCP 3.11 installation playbooks:
   ansible-playbook --user openshift -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" -i inventory "/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/provision.yml"

3. Check the provision playbook finishes successfully and the stack openshift.example.com is created in the overcloud.

Actual results:
The provision playbook fails due to loss of connectivity.

TASK [Wait for the the nodes to come up] **************************************************************************************************************************************************************************
ok: [master-0.openshift.example.com]
ok: [app-node-0.openshift.example.com]
ok: [infra-node-0.openshift.example.com]
fatal: [app-node-1.openshift.example.com]: FAILED! => {"changed": false, "elapsed": 7377, "msg": "timed out waiting for ping module test success: Failed to connect to the host via ssh: Warning: Permanently added
 '10.46.22.38' (ECDSA) to the list of known hosts.\r\nAuthentication failed.\r\n"}

Connectivity with the controller is lost, and openstack commands cannot be executed.


Expected results:
The provision ends successfully and the stack is created.


Additional info:
The ovs-vswitchd service can be restarted by logging in to the controller node through the console.
For that, the controller's root password must be changed first, as it is requested when logging in via the console.
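
For reference, a minimal sketch of bringing the service back from the console once systemd has given up (assuming the stock RHEL 7 openvswitch unit files; not specific to this bug):

# clear the start-limit state so systemd is willing to start the unit again
sudo systemctl reset-failed ovs-vswitchd
sudo systemctl start ovs-vswitchd
# confirm it stays up and watch for the Geneve warnings that precede the crash
sudo systemctl status ovs-vswitchd
sudo tail -f /var/log/openvswitch/ovs-vswitchd.log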

Comment 1 Jon Uriarte 2019-01-31 14:14:52 UTC
ovs-vswitchd logs during the crash (the complete ovs-vswitchd.log file is attached):

2019-01-31T11:27:40.741Z|00001|ofproto_dpif_xlate(handler34)|WARN|Invalid Geneve tunnel metadata while processing udp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:60:c5:f0,dl_dst=fa:16:3e:41:cd:ed,nw_src=192.168.99.4,nw_dst=10.46.22.50,nw_tos=0,nw_ecn=0,nw_ttl=63,tp_src=42305,tp_dst=53 on bridge br-int
2019-01-31T11:27:41.348Z|00001|ofproto_dpif_xlate(handler40)|WARN|Invalid Geneve tunnel metadata while processing tcp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:be:76:44,dl_dst=52:54:00:08:3e:d7,nw_src=172.16.40.10,nw_dst=10.46.22.24,nw_tos=0,nw_ecn=0,nw_ttl=63,tp_src=57364,tp_dst=5000,tcp_flags=ack on bridge br-int
2019-01-31T11:27:42.826Z|00001|ofproto_dpif_xlate(handler38)|WARN|Invalid Geneve tunnel metadata while processing tcp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:18:ea:9c,dl_dst=fa:16:3e:f0:7d:3f,nw_src=172.24.0.14,nw_dst=172.24.0.5,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=9443,tp_dst=37734,tcp_flags=psh|ack on bridge br-int
2019-01-31T11:27:42.826Z|00001|ofproto_dpif_xlate(handler41)|WARN|Invalid Geneve tunnel metadata while processing tcp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:18:ea:9c,dl_dst=fa:16:3e:f0:7d:3f,nw_src=172.24.0.14,nw_dst=172.24.0.5,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=9443,tp_dst=37734,tcp_flags=psh|ack on bridge br-int

Comment 2 Jon Uriarte 2019-01-31 14:15:29 UTC
Created attachment 1525373 [details]
ovs-vswitchd logs

Comment 3 Jon Uriarte 2019-01-31 14:18:44 UTC
It seems this was fixed [1] in the OVS master and 2.10 branches, but not in 2.9.


[1] https://patchwork.ozlabs.org/patch/979576/
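
A quick way to check whether a given openvswitch build already carries the backport (assuming the fix is mentioned in the package changelog, which is not guaranteed):

rpm -q --changelog openvswitch | grep -i -B2 -A2 geneve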

Comment 4 Jon Uriarte 2019-02-01 10:10:23 UTC
The same behaviour occurs in OSP 14, with an almost identical backtrace (core dump attached):

#0  tun_metadata_to_geneve__ (flow=flow@entry=0x7ffc8183c5d8, b=b@entry=0x7ffc81820288, crit_opt=crit_opt@entry=0x7ffc8181ff67) at ../lib/tun-metadata.c:676
#1  0x00005624864a8f6b in tun_metadata_to_geneve_nlattr_flow (b=0x7ffc81820288, flow=0x7ffc8183c590) at ../lib/tun-metadata.c:706
#2  tun_metadata_to_geneve_nlattr (tun=tun@entry=0x7ffc8183c590, flow=flow@entry=0x7ffc8183c590, key=key@entry=0x0, b=b@entry=0x7ffc81820288) at ../lib/tun-metadata.c:810
#3  0x000056248642a4cc in tun_key_to_attr (a=a@entry=0x7ffc81820288, tun_key=tun_key@entry=0x7ffc8183c590, tun_flow_key=tun_flow_key@entry=0x7ffc8183c590, key_buf=key_buf@entry=0x0, tnl_type=<optimized out>, 
    tnl_type@entry=0x0) at ../lib/odp-util.c:2885
#4  0x0000562486435ce2 in odp_key_from_dp_packet (buf=buf@entry=0x7ffc81820288, packet=0x7ffc8183c480) at ../lib/odp-util.c:5908
#5  0x00005624864b8df0 in dpif_netlink_encode_execute (buf=0x7ffc81820288, d_exec=0x7ffc8183c0e8, dp_ifindex=<optimized out>) at ../lib/dpif-netlink.c:1742
#6  dpif_netlink_operate__ (dpif=dpif@entry=0x56248833ca50, ops=ops@entry=0x7ffc8183c0d8, n_ops=n_ops@entry=1) at ../lib/dpif-netlink.c:1828
#7  0x00005624864b95ce in dpif_netlink_operate_chunks (n_ops=1, ops=0x7ffc8183c0d8, dpif=<optimized out>) at ../lib/dpif-netlink.c:2120
#8  dpif_netlink_operate (dpif_=0x56248833ca50, ops=0x7ffc8183c0d8, n_ops=<optimized out>) at ../lib/dpif-netlink.c:2156
#9  0x00005624863fac53 in dpif_operate (dpif=0x56248833ca50, ops=ops@entry=0x7ffc8183c0d8, n_ops=n_ops@entry=1) at ../lib/dpif.c:1349
#10 0x00005624863fb438 in dpif_execute (dpif=<optimized out>, execute=execute@entry=0x7ffc8183c170) at ../lib/dpif.c:1314
#11 0x00005624863a8951 in nxt_resume (ofproto_=0x56248835fea0, pin=0x7ffc8183cbe0) at ../ofproto/ofproto-dpif.c:5085
#12 0x00005624863946f6 in handle_nxt_resume (ofconn=ofconn@entry=0x5624882fe690, oh=oh@entry=0x562488326ce0) at ../ofproto/ofproto.c:3619
#13 0x00005624863a08cb in handle_openflow__ (msg=0x5624883262c0, ofconn=0x5624882fe690) at ../ofproto/ofproto.c:8167
#14 handle_openflow (ofconn=0x5624882fe690, ofp_msg=0x5624883262c0) at ../ofproto/ofproto.c:8288
#15 0x00005624863d2583 in ofconn_run (handle_openflow=0x5624863a0600 <handle_openflow>, ofconn=0x5624882fe690) at ../ofproto/connmgr.c:1446
#16 connmgr_run (mgr=0x562488329bd0, handle_openflow=handle_openflow@entry=0x5624863a0600 <handle_openflow>) at ../ofproto/connmgr.c:365
#17 0x000056248639a4ce in ofproto_run (p=0x56248835fea0) at ../ofproto/ofproto.c:1825
#18 0x0000562486387bac in bridge_run__ () at ../vswitchd/bridge.c:2944
#19 0x000056248638da08 in bridge_run () at ../vswitchd/bridge.c:3002
#20 0x00005624861cce15 in main (argc=12, argv=0x7ffc8183e088) at ../vswitchd/ovs-vswitchd.c:125

Version-Release number of selected component (if applicable):
OSP 14 2019-01-17.2 puddle

Controller (RHEL 7.6):
[root@controller-0 tmp]# rpm -qa | grep openvswitch
openvswitch2.10-2.10.0-28.el7fdp.1.x86_64
rhosp-openvswitch-2.10-0.1.el7ost.noarch
openvswitch2.10-debuginfo-2.10.0-28.el7fdp.1.x86_64
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch

[heat-admin@controller-0 ~]$ sudo docker images | grep ovn
192.168.24.1:8787/rhosp14/openstack-ovn-northd                2019-01-16.1        03bdcefcab6c        8 days ago          752 MB
192.168.24.1:8787/rhosp14/openstack-ovn-northd                pcmklatest          03bdcefcab6c        8 days ago          752 MB
192.168.24.1:8787/rhosp14/openstack-neutron-server-ovn        2019-01-16.1        29207e903690        8 days ago          920 MB
192.168.24.1:8787/rhosp14/openstack-nova-novncproxy           2019-01-16.1        9803d4995c22        2 weeks ago         880 MB
192.168.24.1:8787/rhosp14/openstack-ovn-controller            2019-01-16.1        7cde11f5e83e        2 weeks ago         626 MB


neutron_api container:
python-neutron-fwaas-13.0.1-0.20180910012802.3c6cfbc.el7ost.noarch
openstack-neutron-13.0.2-0.20180929022427.c7f970c.el7ost.noarch
openstack-neutron-common-13.0.2-0.20180929022427.c7f970c.el7ost.noarch
openstack-neutron-ml2-13.0.2-0.20180929022427.c7f970c.el7ost.noarch
puppet-neutron-13.3.1-0.20181013115834.el7ost.noarch
python-neutron-13.0.2-0.20180929022427.c7f970c.el7ost.noarch
openstack-neutron-lbaas-13.0.1-0.20180831185310.e0cca6e.el7ost.noarch
python2-neutronclient-6.9.1-0.20180925041810.7eba94e.el7ost.noarch
python-neutron-lbaas-13.0.1-0.20180831185310.e0cca6e.el7ost.noarch
openstack-neutron-fwaas-13.0.1-0.20180910012802.3c6cfbc.el7ost.noarch
python2-neutron-lib-1.18.0-0.20180816094046.67865c7.el7ost.noarch


ovn_controller container:
puppet-ovn-13.3.1-0.20181013120724.38e2e33.el7ost.noarch
openvswitch2.10-ovn-common-2.10.0-28.el7fdp.1.x86_64
rhosp-openvswitch-ovn-host-2.10-0.1.el7ost.noarch
rhosp-openvswitch-ovn-common-2.10-0.1.el7ost.noarch
openvswitch2.10-ovn-host-2.10.0-28.el7fdp.1.x86_64

Comment 5 Jon Uriarte 2019-02-01 10:12:14 UTC
Created attachment 1525778 [details]
ovs-vswitchd core dump

Comment 6 Numan Siddique 2019-02-01 15:41:06 UTC
Looking at the backtrace, the patch [1] should fix the issue.

[1] - https://github.com/openvswitch/ovs/commit/bed941ba0f14854683c241fa9bff3d49dd2efeee

Comment 7 Numan Siddique 2019-02-01 15:45:43 UTC
(In reply to Numan Siddique from comment #6)
> Looking at the backtrace, the patch [1] should fix the issue.
> 
> [1] -
> https://github.com/openvswitch/ovs/commit/
> bed941ba0f14854683c241fa9bff3d49dd2efeee

Comment #3 already says that :)

Comment 14 Daniel Alvarez Sanchez 2019-02-06 13:12:11 UTC
@juriarte from the shiftstack team confirmed that the crash doesn't happen with this version. Validation is still in progress.
Jon, please update this bug with your findings.

Comment 18 Jon Uriarte 2019-02-28 15:10:19 UTC
Waiting for [1] to be fixed, as it blocks the OpenShift installation.


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1684077

Comment 19 Jon Uriarte 2019-03-01 15:22:34 UTC
Verified in OSP13 2019-02-25.2 puddle.

[stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed 
13  -p 2019-02-25.2

[heat-admin@controller-0 ~]$ rpm -qa | grep openvswitch
openvswitch-2.9.0-95.el7fdp.x86_64
openvswitch-ovn-host-2.9.0-95.el7fdp.x86_64
openvswitch-ovn-common-2.9.0-95.el7fdp.x86_64
openstack-neutron-openvswitch-12.0.5-4.el7ost.noarch
python-openvswitch-2.9.0-95.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-ovn-central-2.9.0-95.el7fdp.x86_64

[heat-admin@controller-0 ~]$ sudo grep -r service_plugins /var/lib/config-data/puppet-generated/neutron/etc/                                                                                                       
/var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf:#service_plugins =
/var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf:service_plugins=qos,ovn-router,trunk

[heat-admin@controller-0 ~]$ sudo docker images | grep ovn
192.168.24.1:8787/rhosp13/openstack-neutron-server-ovn        2019-02-24.1        9626dfbcacff        4 days ago          856 MB
192.168.24.1:8787/rhosp13/openstack-nova-novncproxy           2019-02-24.1        8e3ba946872f        4 days ago          808 MB
192.168.24.1:8787/rhosp13/openstack-ovn-controller            2019-02-24.1        5ff907fbf00f        4 days ago          590 MB
192.168.24.1:8787/rhosp13/openstack-ovn-northd                2019-02-24.1        2efa8840e5bf        4 days ago          716 MB
192.168.24.1:8787/rhosp13/openstack-ovn-northd                pcmklatest          2efa8840e5bf        4 days ago          716 MB

- neutron_api container:
[heat-admin@controller-0 ~]$ sudo docker exec -it neutron_api rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-2.9.0-95.el7fdp.x86_64
python-openvswitch-2.9.0-95.el7fdp.x86_64

- ovn_controller container:
[heat-admin@controller-0 ~]$ sudo docker exec -it ovn_controller rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
openvswitch-2.9.0-95.el7fdp.x86_64
python-openvswitch-2.9.0-95.el7fdp.x86_64
openvswitch-ovn-common-2.9.0-95.el7fdp.x86_64
openvswitch-ovn-host-2.9.0-95.el7fdp.x86_64


Verification steps:

1. Deploy OSP 13 with Octavia and the OVN backend in a hybrid environment
  · Set in Jenkins job:
    ir_tripleo_overcloud_templates: 'ovn-extras,octavia'
    ir_tripleo_overcloud_network_backend:
        - 'geneve'
    ir_tripleo_overcloud_deploy_override_options: |-
        --config-heat ControllerExtraConfig.neutron::agents::l3::extensions="fip_qos" \
        --network-ovn yes \
        --extra-deploy-params='-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml'

   The OSP installation must set "service_plugins=qos,ovn-router,trunk" in neutron.conf.

2. Deploy OCP 3.11 with Kuryr:
Note: a workaround for [1] has been applied: "yum install openshift-ansible --enablerepo=rhelosp-rhel-7.6-server-opt"

   ansible-playbook --user openshift -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" -i inventory "/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/provision.yml"

3. Check the provision playbook finishes successfully and the stack openshift.example.com is created in the overcloud.

(shiftstack) [cloud-user@ansible-host-0 ~]$ openstack stack list
+--------------------------------------+-----------------------+-----------------+----------------------+--------------+
| ID                                   | Stack Name            | Stack Status    | Creation Time        | Updated Time |
+--------------------------------------+-----------------------+-----------------+----------------------+--------------+
| 2fa65297-882c-4a2f-8b5f-dfe8aeca1711 | openshift.example.com | CREATE_COMPLETE | 2019-03-01T13:39:49Z | None         |
+--------------------------------------+-----------------------+-----------------+----------------------+--------------+

(shiftstack) [cloud-user@ansible-host-0 ~]$ openstack server list
+--------------------------------------+------------------------------------+--------+------------------------------------------------------------------------+---------------------------------------+-----------+
| ID                                   | Name                               | Status | Networks                                                               | Image                                 | Flavor    |
+--------------------------------------+------------------------------------+--------+------------------------------------------------------------------------+---------------------------------------+-----------+
| 080e3c8e-b644-489e-bc0c-97d1fc13b8b9 | infra-node-0.openshift.example.com | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.9, 10.46.22.113 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.node   |
| eb91ed29-a48b-49f8-8ce6-b32f4f8e5531 | master-0.openshift.example.com     | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.8, 10.46.22.103 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.master |
| 8092cf88-3623-47bc-98e5-2b01df2c3e8b | app-node-0.openshift.example.com   | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.23, 10.46.22.96 | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.node   |
| 96db78db-1c70-4d47-886b-d0f3461a5f22 | app-node-1.openshift.example.com   | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.4, 10.46.22.91  | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.node   |
| c89b10ce-148d-47ba-889e-5ba3c6228737 | ansible_host-0                     | ACTIVE | private_openshift=172.16.40.18, 10.46.22.105                           | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.small  |
| c5b6b356-42c0-456f-8be1-28e1090aedd2 | openshift_dns-0                    | ACTIVE | openshift_dns=192.168.23.13, 10.46.22.95                               | rhel-guest-image-7.6-210.x86_64.qcow2 | m1.small  |
+--------------------------------------+------------------------------------+--------+------------------------------------------------------------------------+---------------------------------------+-----------+


4. Check that the ovs-vswitchd service has not been restarted on the controller:

[heat-admin@controller-0 ~]$ sudo grep vswitchd /var/log/messages 
Mar  1 06:27:18 controller-0 ovs-ctl: Starting ovs-vswitchd [  OK  ]

[heat-admin@controller-0 ~]$ sudo grep -r "ovs-ctl: " /var/log/messages 
Mar  1 06:27:18 controller-0 ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar  1 06:27:18 controller-0 ovs-ctl: Creating empty database /etc/openvswitch/conf.db [  OK  ]
Mar  1 06:27:18 controller-0 ovs-ctl: Starting ovsdb-server [  OK  ]
Mar  1 06:27:18 controller-0 ovs-ctl: Configuring Open vSwitch system IDs [  OK  ]
Mar  1 06:27:18 controller-0 ovs-ctl: Inserting openvswitch module [  OK  ]
Mar  1 06:27:18 controller-0 ovs-ctl: Starting ovs-vswitchd [  OK  ]
Mar  1 06:27:18 controller-0 ovs-ctl: Enabling remote OVSDB managers [  OK  ]

[heat-admin@controller-0 ~]$ sudo systemctl status ovs-vswitchd
● ovs-vswitchd.service - Open vSwitch Forwarding Unit
   Loaded: loaded (/usr/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: disabled)
   Active: active (running) since vie 2019-03-01 11:27:18 UTC; 3h 50min ago
 Main PID: 4672 (ovs-vswitchd)
    Tasks: 18
   Memory: 147.8M
   CGroup: /system.slice/ovs-vswitchd.service
           └─4672 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --p...

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
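
As an extra sanity check (a sketch, assuming cores still get dumped to /tmp as configured in the description and that the journal covers the test window):

# no new ovs-vswitchd core files should exist
ls -l /tmp/core-ovs-vswitchd-* 2>/dev/null
# no crashes or abnormal exits logged for the unit
sudo journalctl -u ovs-vswitchd | grep -i -E 'segfault|dumped core|Main process exited'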


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1684077

Comment 22 errata-xmlrpc 2019-03-14 13:28:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0552

