Bug 1779460
| Summary: | Failure to create 3-node OVSDB raft cluster | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Russell Bryant <rbryant> |
| Component: | Networking | Assignee: | Ben Bennett <bbennett> |
| Networking sub component: | ovn-kubernetes | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | aconstan, bbennett, weliang, zzhao |
| Version: | 4.4 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-04 11:18:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1779381 | | |
| Bug Blocks: | | | |
Description
Russell Bryant
2019-12-04 02:52:55 UTC
The question: is this a bug in the shell script, or is this a bug in ovsdb argument parsing?

Hi,

Could QE re-validate whether this is still a bug (I am ready to bet an arm and a leg that it is not)? With all the upstream improvements to OVN, I suspect this issue can be closed. Excuse us for this unconventional way of doing this, but it has slipped everyone's mind this past month.

/Alex

Tested on a three-master-node cluster using 4.4.0-0.nightly-2020-01-24-141203; the role information shows one leader and two followers:
[root@dhcp-41-193 FILE]# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.4.0-0.nightly-2020-01-24-141203 True False 31m Cluster version is 4.4.0-0.nightly-2020-01-24-141203
[root@dhcp-41-193 FILE]# for n in $(oc get pods -n openshift-ovn-kubernetes | grep -v NAME | grep ovnkube-master | cut -f1 -d' ') ; do echo "**** $n ****" ; oc exec -n openshift-ovn-kubernetes -it $n -c nbdb -- ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl cluster/status OVN_Northbound ; done
**** ovnkube-master-npr5w ****
b33d
Name: OVN_Northbound
Cluster ID: 5cab (5caba8a9-c72f-4d00-a898-d190012c1b3b)
Server ID: b33d (b33d9d42-e282-4bc1-9867-a56f477ae351)
Address: ssl:10.0.134.27:9643
Status: cluster member
Role: leader
Term: 1
Leader: self
Vote: self
Election timer: 1000
Log: [2, 1599]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-e21e ->e21e <-ff13 ->ff13
Servers:
e21e (e21e at ssl:10.0.170.46:9643) next_index=1599 match_index=1598
ff13 (ff13 at ssl:10.0.150.83:9643) next_index=1599 match_index=1598
b33d (b33d at ssl:10.0.134.27:9643) (self) next_index=2 match_index=1598
**** ovnkube-master-x9hf8 ****
e21e
Name: OVN_Northbound
Cluster ID: 5cab (5caba8a9-c72f-4d00-a898-d190012c1b3b)
Server ID: e21e (e21e5144-6448-4090-a2d2-06b2a5e2014e)
Address: ssl:10.0.170.46:9643
Status: cluster member
Role: follower
Term: 1
Leader: b33d
Vote: unknown
Election timer: 1000
Log: [2, 1599]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 <-b33d <-ff13 ->ff13
Servers:
e21e (e21e at ssl:10.0.170.46:9643) (self)
ff13 (ff13 at ssl:10.0.150.83:9643)
b33d (b33d at ssl:10.0.134.27:9643)
**** ovnkube-master-zjdnn ****
ff13
Name: OVN_Northbound
Cluster ID: 5cab (5caba8a9-c72f-4d00-a898-d190012c1b3b)
Server ID: ff13 (ff1342d9-76a8-4b53-812a-14a00ca5e18b)
Address: ssl:10.0.150.83:9643
Status: cluster member
Role: follower
Term: 1
Leader: b33d
Vote: unknown
Election timer: 1000
Log: [2, 1599]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 ->e21e <-b33d <-e21e
Servers:
e21e (e21e at ssl:10.0.170.46:9643)
ff13 (ff13 at ssl:10.0.150.83:9643) (self)
b33d (b33d at ssl:10.0.134.27:9643)
[root@dhcp-41-193 FILE]#
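For context on the question in the original description (shell script vs. ovsdb argument parsing): in a clustered deployment the Northbound database is bootstrapped with ovsdb-tool, which the startup shell script wraps. Below is a minimal sketch of that bootstrap, reusing the member addresses from the status output above; the database and schema paths are assumptions and vary by image.

```
# Sketch of bootstrapping a 3-node clustered OVN_Northbound database.
# The ovsdb-tool subcommands are real; the database path, schema path,
# and reuse of the member addresses above are illustrative assumptions.

# First master: create the cluster with itself as the only member.
ovsdb-tool create-cluster /etc/ovn/ovnnb_db.db \
    /usr/share/ovn/ovn-nb.ovsschema ssl:10.0.134.27:9643

# Each remaining master: join the cluster, giving the database name,
# its own address, and the address of at least one existing member.
ovsdb-tool join-cluster /etc/ovn/ovnnb_db.db OVN_Northbound \
    ssl:10.0.170.46:9643 ssl:10.0.134.27:9643
```

Once all three members have joined, cluster/status should show exactly the one-leader/two-follower split captured above.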
Note that when I was seeing this, it did not happen on every install. It was just one occasional failure mode.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581
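For anyone re-verifying against a newer build, the same check can be pointed at the Southbound database. Here is a sketch mirroring the loop above; the sbdb container name, the ovnsb_db.ctl socket path, and the OVN_Southbound database name are assumptions based on the pod layout shown in this report.

```
# Same status loop as above, aimed at the Southbound DB. The container
# name, ctl socket path, and database name are assumptions, not
# confirmed in this report.
for n in $(oc get pods -n openshift-ovn-kubernetes | grep -v NAME | grep ovnkube-master | cut -f1 -d' ') ; do
  echo "**** $n ****"
  oc exec -n openshift-ovn-kubernetes "$n" -c sbdb -- \
    ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl cluster/status OVN_Southbound
done
```

A healthy cluster should again report one leader and two followers.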