Bug 1803697

Summary: OVN: check membership status instead of DB status
Product: OpenShift Container Platform
Component: Networking
Networking sub component: ovn-kubernetes
Version: 4.4
Target Milestone: ---
Target Release: 4.4.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Reporter: Alexander Constantinescu <aconstan>
Assignee: Alexander Constantinescu <aconstan>
QA Contact: Anurag saxena <anusaxen>
Doc Type: If docs needed, set a value
Clones: 1803701 (view as bug list)
Bug Blocks: 1803701
Last Closed: 2020-05-13 21:58:04 UTC
Type: Bug

Description Alexander Constantinescu 2020-02-17 09:35:02 UTC
Description of problem:

This is a placeholder for a solved bug with a defined PR

Today our OVN readinessProbes check and report the cluster DB status. They should instead report whether each member is part of the cluster or not.
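A minimal sketch of the intended check, assuming the probe parses the `Status:` line of `ovs-appctl ... cluster/status` output. The `sample_status` function is hypothetical and stands in for the real `ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound` call so the snippet is self-contained:

```shell
#!/bin/sh
# Sketch: a readiness check that keys off cluster membership ("Status:")
# rather than the DB/Raft role. sample_status is a stand-in for the real
# ovs-appctl cluster/status call; its output mirrors this bug's comments.
sample_status() {
  cat <<'EOF'
Name: OVN_Northbound
Cluster ID: bd80 (bd80d0b5-3754-445c-9fae-6a31c4b8a2a1)
Server ID: 5992 (5992fd6d-8e4d-4ce9-aef0-108edde20cf1)
Address: ssl:10.0.0.6:9643
Status: cluster member
Role: leader
EOF
}

# A readiness probe would exit 0 on "cluster member" and non-zero otherwise;
# here we echo the result so the decision is visible.
if sample_status | grep -q '^Status: cluster member$'; then
  echo "ready"
else
  echo "not ready"
fi
```

Checking membership this way stays correct whether the member is leader or follower, which is the point of the change: role and DB state can vary across healthy members, but every healthy member reports "cluster member".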

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 3 Anurag saxena 2020-03-03 16:57:25 UTC
Looks good to me on 4.4.0-0.nightly-2020-03-02-155711. In the ovs-appctl outputs below, every cluster member (IP) appears in the Servers list and reports its cluster membership status as well.

# oc exec ovnkube-master-l6ppd -- ovs-appctl -t /var/run/ovn/ovnnb_db.ctl  cluster/status OVN_Northbound
Defaulting container name to northd.
Use 'oc describe pod/ovnkube-master-l6ppd -n openshift-ovn-kubernetes' to see all of the containers in this pod.
5992
Name: OVN_Northbound
Cluster ID: bd80 (bd80d0b5-3754-445c-9fae-6a31c4b8a2a1)
Server ID: 5992 (5992fd6d-8e4d-4ce9-aef0-108edde20cf1)
Address: ssl:10.0.0.6:9643
Status: cluster member
Role: leader
Term: 4
Leader: self
Vote: self

Election timer: 1000
Log: [2, 1870]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->200a ->7dbb <-200a <-7dbb
Servers:
    200a (200a at ssl:10.0.0.5:9643) next_index=1870 match_index=1869
    7dbb (7dbb at ssl:10.0.0.7:9643) next_index=1870 match_index=1869
    5992 (5992 at ssl:10.0.0.6:9643) (self) next_index=719 match_index=1869

# oc exec ovnkube-master-hb65f -- ovs-appctl -t /var/run/ovn/ovnnb_db.ctl  cluster/status OVN_Northbound
Defaulting container name to northd.
Use 'oc describe pod/ovnkube-master-hb65f -n openshift-ovn-kubernetes' to see all of the containers in this pod.
200a
Name: OVN_Northbound
Cluster ID: bd80 (bd80d0b5-3754-445c-9fae-6a31c4b8a2a1)
Server ID: 200a (200a8e40-ea35-4fc7-b37e-5a635e954201)
Address: ssl:10.0.0.5:9643
Status: cluster member
Role: follower
Term: 4
Leader: 5992
Vote: 5992

Election timer: 1000
Log: [2, 1870]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->5992 ->7dbb <-5992 <-7dbb
Servers:
    200a (200a at ssl:10.0.0.5:9643) (self)
    5992 (5992 at ssl:10.0.0.6:9643)
    7dbb (7dbb at ssl:10.0.0.7:9643)

# oc exec ovnkube-master-9bdrt -- ovs-appctl -t /var/run/ovn/ovnnb_db.ctl  cluster/status OVN_Northbound
Defaulting container name to northd.
Use 'oc describe pod/ovnkube-master-9bdrt -n openshift-ovn-kubernetes' to see all of the containers in this pod.
7dbb
Name: OVN_Northbound
Cluster ID: bd80 (bd80d0b5-3754-445c-9fae-6a31c4b8a2a1)
Server ID: 7dbb (7dbb6801-f01e-4c3c-9a77-c0c6c45f2213)
Address: ssl:10.0.0.7:9643
Status: cluster member
Role: follower
Term: 4
Leader: 5992
Vote: unknown

Election timer: 1000
Log: [2, 1870]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->200a ->5992 <-5992 <-200a
Servers:
    200a (200a at ssl:10.0.0.5:9643)
    7dbb (7dbb at ssl:10.0.0.7:9643) (self)
    5992 (5992 at ssl:10.0.0.6:9643)

Comment 5 errata-xmlrpc 2020-05-13 21:58:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581