Bug 1684363

Summary: [ovn_cluster] master node can't come back up after restarting openvswitch
Product: Red Hat Enterprise Linux 7
Reporter: haidong li <haili>
Component: openvswitch
Assignee: Dumitru Ceara <dceara>
Status: CLOSED WONTFIX
QA Contact: haidong li <haili>
Severity: high
Priority: high
Version: 7.6
CC: aloughla, atragler, ctrautma, dcbw, fhallal, nusiddiq, ovs-qe, qding, rkhan, tredaelli
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Cloned To: 1723291 (view as bug list)
Last Closed: 2019-09-30 09:33:46 UTC
Type: Bug
Bug Blocks: 1723291

Description haidong li 2019-03-01 05:08:59 UTC
Description of problem:
The master node does not come back up after openvswitch is restarted.

Version-Release number of selected component (if applicable):
[root@hp-dl380pg8-16 ~]# uname -a
Linux hp-dl380pg8-16.rhts.eng.pek2.redhat.com 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@hp-dl380pg8-16 ~]# rpm -qa |grep openvswitch
kernel-kernel-networking-openvswitch-ovn_ha-1.0-30.noarch
openvswitch-ovn-common-2.9.0-97.el7fdp.x86_64
openvswitch-2.9.0-97.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-10.el7fdp.noarch
openvswitch-ovn-host-2.9.0-97.el7fdp.x86_64
openvswitch-ovn-central-2.9.0-97.el7fdp.x86_64

How reproducible:
every time

Steps to Reproduce:
1. Set up a pacemaker cluster with 3 nodes running the ovndb_servers master/slave resource (a hedged setup sketch follows these steps).
2. Restart openvswitch on the master node.
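
For context, the cluster in step 1 is presumably built along the lines of the OVN pacemaker integration guide. The commands below are only an assumed sketch: the resource names (ovndb_servers, ovndb_servers-master, ip-70.0.0.50) are taken from the pcs output further down, while the ovn-ctl path and the monitor intervals are guesses rather than the reporter's exact setup.

# assumed setup sketch, not the reporter's verbatim commands
pcs resource create ip-70.0.0.50 ocf:heartbeat:IPaddr2 ip=70.0.0.50 op monitor interval=30s
pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
    master_ip=70.0.0.50 ovn_ctl=/usr/share/openvswitch/scripts/ovn-ctl \
    op monitor interval=10s op monitor role=Master interval=15s
pcs resource master ovndb_servers-master ovndb_servers meta notify=true
pcs constraint order promote ovndb_servers-master then start ip-70.0.0.50
pcs constraint colocation add ip-70.0.0.50 with master ovndb_servers-master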

[root@hp-dl388g8-02 ~]# pcs status
Cluster name: my_cluster

WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)

Stack: corosync
Current DC: hp-dl380pg8-16.rhts.eng.pek2.redhat.com (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Feb 28 21:18:24 2019
Last change: Thu Feb 28 09:02:19 2019 by root via crm_attribute on hp-dl388g8-02.rhts.eng.pek2.redhat.com

3 nodes configured
4 resources configured

Online: [ hp-dl380pg8-16.rhts.eng.pek2.redhat.com hp-dl388g8-02.rhts.eng.pek2.redhat.com hp-dl388g8-19.rhts.eng.pek2.redhat.com ]

Full list of resources:

 ip-70.0.0.50    (ocf::heartbeat:IPaddr2):    Started hp-dl388g8-02.rhts.eng.pek2.redhat.com
 Master/Slave Set: ovndb_servers-master [ovndb_servers]
     Masters: [ hp-dl388g8-02.rhts.eng.pek2.redhat.com ]
     Slaves: [ hp-dl380pg8-16.rhts.eng.pek2.redhat.com hp-dl388g8-19.rhts.eng.pek2.redhat.com ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

[root@hp-dl388g8-02 ~]# systemctl restart openvswitch
[root@hp-dl388g8-02 ~]# pcs status
Cluster name: my_cluster

WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)

Stack: corosync
Current DC: hp-dl380pg8-16.rhts.eng.pek2.redhat.com (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Feb 28 21:30:59 2019
Last change: Thu Feb 28 21:19:00 2019 by root via crm_attribute on hp-dl388g8-19.rhts.eng.pek2.redhat.com

3 nodes configured
4 resources configured

Online: [ hp-dl380pg8-16.rhts.eng.pek2.redhat.com hp-dl388g8-02.rhts.eng.pek2.redhat.com hp-dl388g8-19.rhts.eng.pek2.redhat.com ]

Full list of resources:

 ip-70.0.0.50    (ocf::heartbeat:IPaddr2):    Started hp-dl388g8-19.rhts.eng.pek2.redhat.com
 Master/Slave Set: ovndb_servers-master [ovndb_servers]
     Masters: [ hp-dl388g8-19.rhts.eng.pek2.redhat.com ]
     Slaves: [ hp-dl380pg8-16.rhts.eng.pek2.redhat.com ]
     Stopped: [ hp-dl388g8-02.rhts.eng.pek2.redhat.com ]

Failed Actions:
* ovndb_servers_demote_0 on hp-dl388g8-02.rhts.eng.pek2.redhat.com 'not running' (7): call=42, status=complete, exitreason='',
    last-rc-change='Thu Feb 28 21:18:59 2019', queued=0ms, exec=40ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Actual results:
After openvswitch is restarted on the master node, the ovndb_servers resource on that node stays Stopped and its demote action fails with 'not running'; the node never rejoins the master/slave set.

Expected results:
After openvswitch is restarted, the ovndb_servers resource on the node comes back up and rejoins the master/slave set.
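
One hedged way to dig further or to clear the failure for a retry (assumed follow-up commands, not something the report shows being run):

# assumed diagnostic/recovery steps
journalctl -u openvswitch -u pacemaker --since "2019-02-28 21:18"   # inspect logs around the failed demote
pcs resource cleanup ovndb_servers    # clear the failed action and let pacemaker retry the resource
pcs status                            # check whether the node returns to the master/slave set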

Additional info: