Bug 1684363

Summary: [ovn_cluster] master node can't come back up after restarting openvswitch
Product: Red Hat Enterprise Linux 7
Reporter: haidong li <haili>
Component: openvswitch
Assignee: Dumitru Ceara <dceara>
Status: CLOSED WONTFIX
QA Contact: haidong li <haili>
Severity: high
Priority: high
Version: 7.6
CC: aloughla, atragler, ctrautma, dcbw, fhallal, nusiddiq, ovs-qe, qding, rkhan, tredaelli
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Cloned To: 1723291 (view as bug list)
Last Closed: 2019-09-30 09:33:46 UTC
Type: Bug
Bug Blocks: 1723291

Description haidong li 2019-03-01 05:08:59 UTC
Description of problem:
The master node does not come back up after openvswitch is restarted.

Version-Release number of selected component (if applicable):
[root@hp-dl380pg8-16 ~]# uname -a
Linux hp-dl380pg8-16.rhts.eng.pek2.redhat.com 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@hp-dl380pg8-16 ~]# rpm -qa |grep openvswitch
kernel-kernel-networking-openvswitch-ovn_ha-1.0-30.noarch
openvswitch-ovn-common-2.9.0-97.el7fdp.x86_64
openvswitch-2.9.0-97.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-10.el7fdp.noarch
openvswitch-ovn-host-2.9.0-97.el7fdp.x86_64
openvswitch-ovn-central-2.9.0-97.el7fdp.x86_64

How reproducible:
every time

Steps to Reproduce:
1. Set up a pacemaker cluster with 3 nodes running the ovndb_servers master/slave resource (a hedged setup sketch follows these steps).
2. Restart openvswitch on the master node.
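
For context, the cluster in step 1 is presumably built along the lines of the OVN pacemaker integration guide. The commands below are only an assumed sketch: the resource names (ovndb_servers, ovndb_servers-master, ip-70.0.0.50) are taken from the pcs output further down, while the ovn-ctl path and the monitor intervals are guesses rather than the reporter's exact setup.

# assumed setup sketch, not the reporter's verbatim commands
pcs resource create ip-70.0.0.50 ocf:heartbeat:IPaddr2 ip=70.0.0.50 op monitor interval=30s
pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
    master_ip=70.0.0.50 ovn_ctl=/usr/share/openvswitch/scripts/ovn-ctl \
    op monitor interval=10s op monitor role=Master interval=15s
pcs resource master ovndb_servers-master ovndb_servers meta notify=true
pcs constraint order promote ovndb_servers-master then start ip-70.0.0.50
pcs constraint colocation add ip-70.0.0.50 with master ovndb_servers-master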

[root@hp-dl388g8-02 ~]# pcs status
Cluster name: my_cluster

WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)

Stack: corosync
Current DC: hp-dl380pg8-16.rhts.eng.pek2.redhat.com (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Feb 28 21:18:24 2019
Last change: Thu Feb 28 09:02:19 2019 by root via crm_attribute on hp-dl388g8-02.rhts.eng.pek2.redhat.com

3 nodes configured
4 resources configured

Online: [ hp-dl380pg8-16.rhts.eng.pek2.redhat.com hp-dl388g8-02.rhts.eng.pek2.redhat.com hp-dl388g8-19.rhts.eng.pek2.redhat.com ]

Full list of resources:

 ip-70.0.0.50    (ocf::heartbeat:IPaddr2):    Started hp-dl388g8-02.rhts.eng.pek2.redhat.com
 Master/Slave Set: ovndb_servers-master [ovndb_servers]
     Masters: [ hp-dl388g8-02.rhts.eng.pek2.redhat.com ]
     Slaves: [ hp-dl380pg8-16.rhts.eng.pek2.redhat.com hp-dl388g8-19.rhts.eng.pek2.redhat.com ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

[root@hp-dl388g8-02 ~]# systemctl restart openvswitch
[root@hp-dl388g8-02 ~]# pcs status
Cluster name: my_cluster

WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)

Stack: corosync
Current DC: hp-dl380pg8-16.rhts.eng.pek2.redhat.com (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Feb 28 21:30:59 2019
Last change: Thu Feb 28 21:19:00 2019 by root via crm_attribute on hp-dl388g8-19.rhts.eng.pek2.redhat.com

3 nodes configured
4 resources configured

Online: [ hp-dl380pg8-16.rhts.eng.pek2.redhat.com hp-dl388g8-02.rhts.eng.pek2.redhat.com hp-dl388g8-19.rhts.eng.pek2.redhat.com ]

Full list of resources:

 ip-70.0.0.50    (ocf::heartbeat:IPaddr2):    Started hp-dl388g8-19.rhts.eng.pek2.redhat.com
 Master/Slave Set: ovndb_servers-master [ovndb_servers]
     Masters: [ hp-dl388g8-19.rhts.eng.pek2.redhat.com ]
     Slaves: [ hp-dl380pg8-16.rhts.eng.pek2.redhat.com ]
     Stopped: [ hp-dl388g8-02.rhts.eng.pek2.redhat.com ]

Failed Actions:
* ovndb_servers_demote_0 on hp-dl388g8-02.rhts.eng.pek2.redhat.com 'not running' (7): call=42, status=complete, exitreason='',
    last-rc-change='Thu Feb 28 21:18:59 2019', queued=0ms, exec=40ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Actual results:
After openvswitch is restarted on the master node, the ovndb_servers resource on that node stays Stopped and its demote action fails with 'not running'; the node never rejoins the master/slave set.

Expected results:
After openvswitch is restarted, the ovndb_servers resource on the node comes back up and rejoins the master/slave set.
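
One hedged way to dig further or to clear the failure for a retry (assumed follow-up commands, not something the report shows being run):

# assumed diagnostic/recovery steps
journalctl -u openvswitch -u pacemaker --since "2019-02-28 21:18"   # inspect logs around the failed demote
pcs resource cleanup ovndb_servers    # clear the failed action and let pacemaker retry the resource
pcs status                            # check whether the node returns to the master/slave set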

Additional info: