Bug 1779460 - Failure to create 3-node OVSDB raft cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Ben Bennett
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On: 1779381
Blocks:
Reported: 2019-12-04 02:52 UTC by Russell Bryant
Modified: 2020-05-04 11:18 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-04 11:18:30 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:0581 0 None None None 2020-05-04 11:18:49 UTC

Description Russell Bryant 2019-12-04 02:52:55 UTC
I filed https://bugzilla.redhat.com/show_bug.cgi?id=1779381 against OVN, but the bug occurred with ovn-kubernetes, so I thought it would be worth having a tracker against ovn-kubernetes to follow the issue to resolution.

Comment 1 Casey Callendrello 2019-12-04 14:08:06 UTC
The question: is this a bug in the shell script, or is this a bug in ovsdb argument parsing?

Comment 2 Alexander Constantinescu 2020-01-27 16:27:48 UTC
Hi 

Could QE re-validate whether this is still a bug? (I am ready to bet an arm and a leg that it is not.) With all the upstream improvements to OVN, I suspect this issue can be closed.

Excuse us for this unconventional way of doing this, but it has slipped everyone's mind this past month.

/Alex

Comment 3 Weibin Liang 2020-01-28 19:57:58 UTC
Tested on a three-master-node cluster using 4.4.0-0.nightly-2020-01-24-141203; the role information shows one leader and two followers.

[root@dhcp-41-193 FILE]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-01-24-141203   True        False         31m     Cluster version is 4.4.0-0.nightly-2020-01-24-141203

[root@dhcp-41-193 FILE]# for n in $(oc get pods -n openshift-ovn-kubernetes | grep -v NAME | grep ovnkube-master | cut -f1 -d' ') ; do echo "**** $n ****" ; oc exec -n openshift-ovn-kubernetes  -it $n -c nbdb -- ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl cluster/status OVN_Northbound  ; done
**** ovnkube-master-npr5w ****
b33d
Name: OVN_Northbound
Cluster ID: 5cab (5caba8a9-c72f-4d00-a898-d190012c1b3b)
Server ID: b33d (b33d9d42-e282-4bc1-9867-a56f477ae351)
Address: ssl:10.0.134.27:9643
Status: cluster member
Role: leader
Term: 1
Leader: self
Vote: self

Election timer: 1000
Log: [2, 1599]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-e21e ->e21e <-ff13 ->ff13
Servers:
    e21e (e21e at ssl:10.0.170.46:9643) next_index=1599 match_index=1598
    ff13 (ff13 at ssl:10.0.150.83:9643) next_index=1599 match_index=1598
    b33d (b33d at ssl:10.0.134.27:9643) (self) next_index=2 match_index=1598
**** ovnkube-master-x9hf8 ****
e21e
Name: OVN_Northbound
Cluster ID: 5cab (5caba8a9-c72f-4d00-a898-d190012c1b3b)
Server ID: e21e (e21e5144-6448-4090-a2d2-06b2a5e2014e)
Address: ssl:10.0.170.46:9643
Status: cluster member
Role: follower
Term: 1
Leader: b33d
Vote: unknown

Election timer: 1000
Log: [2, 1599]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 <-b33d <-ff13 ->ff13
Servers:
    e21e (e21e at ssl:10.0.170.46:9643) (self)
    ff13 (ff13 at ssl:10.0.150.83:9643)
    b33d (b33d at ssl:10.0.134.27:9643)
**** ovnkube-master-zjdnn ****
ff13
Name: OVN_Northbound
Cluster ID: 5cab (5caba8a9-c72f-4d00-a898-d190012c1b3b)
Server ID: ff13 (ff1342d9-76a8-4b53-812a-14a00ca5e18b)
Address: ssl:10.0.150.83:9643
Status: cluster member
Role: follower
Term: 1
Leader: b33d
Vote: unknown

Election timer: 1000
Log: [2, 1599]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 ->e21e <-b33d <-e21e
Servers:
    e21e (e21e at ssl:10.0.170.46:9643)
    ff13 (ff13 at ssl:10.0.150.83:9643) (self)
    b33d (b33d at ssl:10.0.134.27:9643)
[root@dhcp-41-193 FILE]#
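
The check done by eye above (one leader, two followers, and the same Cluster ID on every member) could also be scripted. The following is a minimal sketch, not part of the original verification: the function names `summarize_raft_status` and `raft_status_ok` are hypothetical, and it assumes the `cluster/status` output of all members has been concatenated into a single stream in the format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the bug report) for checking saved
# `cluster/status` output from all OVSDB raft members.

# Reads concatenated cluster/status text on stdin and prints one line:
#   leaders=<n> followers=<n> cluster_ids=<n>
summarize_raft_status() {
  awk '
    /^Role: leader/   { leaders++ }
    /^Role: follower/ { followers++ }
    /^Cluster ID: /   { ids[$3] = 1 }   # $3 is the short ID, e.g. 5cab
    END {
      n = 0
      for (id in ids) n++
      printf "leaders=%d followers=%d cluster_ids=%d\n", leaders, followers, n
    }'
}

# Succeeds only if stdin describes exactly one leader, (members - 1)
# followers, and a single Cluster ID shared by all members.
# $1: expected number of cluster members (defaults to 3).
raft_status_ok() {
  local expected_members="${1:-3}"
  local summary
  summary="$(summarize_raft_status)"
  [ "$summary" = "leaders=1 followers=$((expected_members - 1)) cluster_ids=1" ]
}
```

For example, if the output of the loop above were saved to a file (here called `status.txt`, a made-up name), `raft_status_ok 3 < status.txt && echo OK` would print `OK` for the output shown in this comment. Because the failure was reported as intermittent, a check like this would need to be run across several installs to be meaningful.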

Comment 4 Russell Bryant 2020-01-29 15:55:52 UTC
Note that when I was seeing this, it did not happen on every install.  It was just one occasional failure mode.

Comment 6 errata-xmlrpc 2020-05-04 11:18:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

