Bug 1857387

Summary: nbdb/sbdb pod should not reset the manually set election-timer to default during restart.
Product: OpenShift Container Platform
Component: Networking
Networking sub component: ovn-kubernetes
Reporter: Anil Vishnoi <avishnoi>
Assignee: Anil Vishnoi <avishnoi>
QA Contact: Anurag saxena <anusaxen>
CC: bbennett
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Version: 4.5
Target Release: 4.6.0
Hardware: All
OS: All
Type: Bug
Last Closed: 2020-09-03 13:42:37 UTC

Description Anil Vishnoi 2020-07-15 18:49:13 UTC
Description of problem:
Currently the nbdb/sbdb election-timer value is set to 5 seconds (up from the default of 1 second) during pod startup. Once the pods are running, a user can manually change the election-timer value, and the new value is stored in the db. But when the pod restarts, the startup logic overrides the election-timer with the 5-second default again. Ideally it should not reset the value if the election-timer has been manually overridden.
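
The requested startup behavior can be sketched as a small decision function. This is only an illustration under stated assumptions, not the operator's or the pod's actual startup code: `choose_election_timer_ms`, `BUILTIN_DEFAULT_MS`, and `STARTUP_DEFAULT_MS` are hypothetical names, and treating "the stored value differs from the 1-second default" as the sign that the timer was already configured is an assumed heuristic.

```python
# Hypothetical constants, taken from the values quoted in this report.
BUILTIN_DEFAULT_MS = 1000   # "default 1 second"
STARTUP_DEFAULT_MS = 5000   # value the pod applies at startup (5 seconds)


def choose_election_timer_ms(stored_ms):
    """Return the election timer (in ms) that startup should end up with.

    stored_ms is the value currently stored in the db, or None if it
    could not be read (for example, a brand-new database).
    """
    # Fresh or unreadable db: apply the startup default.
    if stored_ms is None or stored_ms == BUILTIN_DEFAULT_MS:
        return STARTUP_DEFAULT_MS
    # Assumption: any other value was set deliberately (either by an
    # earlier startup or by a user), so leave it alone.
    return stored_ms


if __name__ == "__main__":
    for stored in (None, 1000, 5000, 10000):
        print(stored, "->", choose_election_timer_ms(stored))
```

With this rule a manually configured 10000 ms timer would survive a restart, while a freshly created database would still be raised to 5000 ms.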

Version-Release number of selected component (if applicable):


How reproducible:

Very easy to reproduce.


Steps to Reproduce:
1. Deploy ovn-kubernetes.
2. Change the nbdb/sbdb election-timer to some value other than the default of 5 seconds (e.g. 10 seconds).
3. Restart the nbdb/sbdb pod. This resets the value from 10 seconds to 5 seconds. A message like the following is written to the nb/sb db log file (this sample shows a timer of 16000 ms being reset to 5000 ms):

```
2020-07-10T14:58:29Z|00025|raft|INFO|Election timer changed from 16000 to 5000
```
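
To spot this reset in a db log, the raft message can be matched with a regular expression. The following is a minimal sketch that assumes the log line has exactly the shape of the sample above; `parse_timer_change` and `is_reset_to_default` are hypothetical helper names, and the 5000 ms default is the value quoted in this report.

```python
import re

# Matches the raft INFO message shown in the sample log line.
RAFT_TIMER_CHANGE = re.compile(
    r"\|raft\|INFO\|Election timer changed from (\d+) to (\d+)"
)


def parse_timer_change(line):
    """Return (old_ms, new_ms) if the line reports a timer change, else None."""
    match = RAFT_TIMER_CHANGE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_reset_to_default(line, default_ms=5000):
    """True if the line shows a non-default timer being changed to the default."""
    change = parse_timer_change(line)
    if change is None:
        return False
    old_ms, new_ms = change
    return old_ms != default_ms and new_ms == default_ms


if __name__ == "__main__":
    sample = ("2020-07-10T14:58:29Z|00025|raft|INFO|"
              "Election timer changed from 16000 to 5000")
    print(parse_timer_change(sample))
    print(is_reset_to_default(sample))
```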

Actual results:
A restart resets the election-timer value to the default value.

Expected results:
The pods should restart with the non-default value (if one is configured).

Additional info:

Comment 4 Anurag saxena 2020-08-31 19:19:40 UTC
There is a reverted PR for this, https://github.com/openshift/cluster-network-operator/pull/769, which was presumably causing https://bugzilla.redhat.com/show_bug.cgi?id=1872098. There might be other reasons behind the OVN cluster failures, which are still being investigated. Moving this to ASSIGNED for now, awaiting resolution on the PR.

Comment 5 Ben Bennett 2020-09-03 13:42:37 UTC
We do not support people editing the NBDB directly.  Don't shoot yourself in the foot.