Bug 1761575

Summary: [RHEL 8] ovsdb-server sometimes doesn't apply the db server status change to all the JSON-RPC sessions.
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Numan Siddique <nusiddiq>
Component: openvswitch2.12 Assignee: Numan Siddique <nusiddiq>
Status: CLOSED ERRATA QA Contact: Jianlin Shi <jishi>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: FDP 19.G CC: ctrautma, jhsiao, jishi, kfida, ovs-qe, ovs-team, ralongi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1761573 Environment:
Last Closed: 2019-12-11 12:06:53 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1761572, 1761573, 1761577    
Bug Blocks:    

Description Numan Siddique 2019-10-14 18:48:29 UTC
+++ This bug was initially created as a clone of Bug #1761573 +++

+++ This bug was initially created as a clone of Bug #1761572 +++

Description of problem:
In an OVN deployment, when an ovsdb-server failover happens, some ovn-controllers may connect to the ovsdb-server master in read-only mode. Once the ovsdb-servers are promoted to master, the ovn-controllers should ideally reconnect and get read-write access to the db. But sometimes the connection is not reset, and these ovn-controllers remain connected to the ovsdb-servers in read-only mode. As a result, they cannot write to the SB db, which causes VM boot failures and mac_binding write failures.
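
One way to check which mode an ovsdb-server is currently in is to look at the output of the ovsdb-server/sync-status command on its control socket. The sketch below is a minimal helper, assuming that output contains a line of the form "state: active" or "state: backup". The function name db_state and the SB control socket path in the usage comment are assumptions for illustration, not taken from this report.

```shell
#!/bin/bash
# Hypothetical helper: read `ovsdb-server/sync-status` output on stdin and
# print the value of its "state:" line (assumed to be "active" or "backup").
# Prints nothing if no such line is present.
db_state() {
    awk -F': *' '$1 == "state" { print $2; exit }'
}

# Usage (assumed control socket path, adjust for the deployment):
#   ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/sync-status | db_state
```

Note that this only reports the server's own state. It does not show whether an individual JSON-RPC session is still being treated as read-only, which is the problem described above.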



Comment 2 Jianlin Shi 2019-11-18 03:45:42 UTC
reproducer:
#!/bin/bash

systemctl restart ovn-northd
# Start ovn-nbctl as a daemon; it prints the path of its control socket.
NB_PATH=$(ovn-nbctl --pidfile --detach)
ovs-appctl -t "$NB_PATH" run ls-add tst
ovn-nbctl ls-del tst1
ovn-nbctl ls-del tst2
ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl ovsdb-server/set-active-ovsdb-server tcp:192.0.0.2:6641
for i in {1..50}
do
    # Switch the NB db server to backup mode, then back to active.
    ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl ovsdb-server/connect-active-ovsdb-server
    #ovs-appctl -t "$NB_PATH" run ls-add tst1
    ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl ovsdb-server/disconnect-active-ovsdb-server
    # The daemon's existing session should be writable again; a failed write reproduces the bug.
    if ! ovs-appctl -t "$NB_PATH" run ls-add tst2
    then
        echo "fail"
        break
    fi
    ovn-nbctl show
    ovn-nbctl ls-del tst1
    ovn-nbctl ls-del tst2
done

Failed to reproduce the issue on ovs2.11, but the reproducer is slow to run there:
real    6m26.828s                                                                                     
user    0m1.714s                                                                                      
sys     0m1.145s                                                                                      
[root@hp-dl380pg8-11 bz1761575]# rpm -qa | grep -E "openvswitch|ovn"                                  
openvswitch-selinux-extra-policy-1.0-19.el8fdp.noarch                                                 
ovn2.11-host-2.11.1-2.el8fdp.x86_64                                                                   
openvswitch2.11-2.11.0-25.el8fdp.x86_64                                                               
ovn2.11-central-2.11.1-2.el8fdp.x86_64                                                                
ovn2.11-2.11.1-2.el8fdp.x86_64

No problem on ovs2.12, and the reproducer runs much faster:
real    0m3.427s                                                                                      
user    0m1.659s                                                                                      
sys     0m1.158s                                                                                      
[root@hp-dl380pg8-11 bz1761575]# rpm -qa | grep -E "openvswitch|ovn"                                  
openvswitch-selinux-extra-policy-1.0-19.el8fdp.noarch                                                 
openvswitch2.12-2.12.0-4.el8fdp.x86_64                                                                
ovn2.12-2.12.0-7.el8fdp.x86_64                                                                        
ovn2.12-host-2.12.0-7.el8fdp.x86_64                                                                   
ovn2.12-central-2.12.0-7.el8fdp.x86_64

Setting to VERIFIED.

Comment 4 errata-xmlrpc 2019-12-11 12:06:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4207