
Bug 1760763

Summary: [ovsdb-server] Allow replicating from older schema servers
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Daniel Alvarez Sanchez <dalvarez>
Component: openvswitch2.11
Assignee: Numan Siddique <nusiddiq>
Status: CLOSED ERRATA
QA Contact: Jianlin Shi <jishi>
Severity: medium
Docs Contact:
Priority: unspecified
Version: RHEL 8.0
CC: ctrautma, fleitner, jhsiao, jishi, kfida, ralongi, sathlang
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1766586 1771854
Environment:
Last Closed: 2020-01-22 04:02:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1759974, 1766586, 1771854, 1775795

Description Daniel Alvarez Sanchez 2019-10-11 09:32:51 UTC
When we upgrade a Pacemaker-based cluster of ovsdb-servers one node at a time, the DB schema may be updated along the way.

Right now, ovsdb-server backup instances that connect to a master ovsdb-server with an older schema refuse to replicate, which can prevent the Pacemaker cluster from getting back to a stable state.

The purpose of this BZ is to allow ovsdb-server to replicate from a master instance that is on an older schema. Since backwards compatibility of the schema is guaranteed, this should have no side effects and ensures consistency.
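
To illustrate the idea, here is a minimal sketch of the kind of check a backup could apply before replicating. It is not the ovsdb-server implementation: the function name can_replicate_from, the dict-of-sets schema representation, and the sample schemas are assumptions made only for this example. It relies on the point above that a newer schema is a backwards-compatible superset of the older one.

```python
from typing import Dict, Set

# Hypothetical in-memory view of a schema: table name -> set of column names.
Schema = Dict[str, Set[str]]


def can_replicate_from(local: Schema, remote: Schema) -> bool:
    """Return True if every table and column served by the master (remote)
    also exists in the backup's (local) schema.

    Assumption: a newer schema only adds tables and columns, so a backup
    on the newer schema can store everything an older master sends.
    """
    for table, remote_columns in remote.items():
        local_columns = local.get(table)
        if local_columns is None:
            return False  # the master has a table the backup cannot store
        if not remote_columns <= local_columns:
            return False  # the master has a column the backup cannot store
    return True


# Illustrative schemas only; they are not complete OVN_Northbound schemas.
old_nb: Schema = {
    "Logical_Router": {"name"},
    "Logical_Switch_Port": {"name"},
}
new_nb: Schema = {
    "Logical_Router": {"name", "policies"},
    "Logical_Router_Policy": {"priority"},
    "Logical_Switch_Port": {"name", "ha_chassis_group"},
}

print(can_replicate_from(local=new_nb, remote=old_nb))  # newer backup, older master
print(can_replicate_from(local=old_nb, remote=new_nb))  # older backup, newer master
```

In this sketch only the first direction (newer backup, older master) is accepted, which is the case this BZ asks for.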

Comment 1 Numan Siddique 2019-11-04 13:15:05 UTC
The fix is available in openvswitch2.11-2.11.0-27.el8fdp for now.
It should be available in the next FDP release.

Comment 3 Jianlin Shi 2019-12-18 02:46:13 UTC
Reproduced on openvswitch2.11-2.11.0-26:
:: [ 21:27:37 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com ovn-nbctl ls-add ls1'                                                                                            
2019-12-18T02:27:37Z|00002|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:37Z|00003|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:37Z|00004|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
2019-12-18T02:27:37Z|00005|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:37Z|00006|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:37Z|00007|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
:: [ 21:27:37 ] :: [   PASS   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com ovn-nbctl ls-add ls1' (Expected 0, got 0)                                                                        
:: [ 21:27:37 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com ovn-nbctl show | grep ls1'                                                                                       
2019-12-18T02:27:38Z|00001|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:38Z|00002|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:38Z|00003|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
2019-12-18T02:27:38Z|00004|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:38Z|00005|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:38Z|00006|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
switch db3691ab-9ac5-44ea-b30b-4cb1cc74f69e (ls1)                                                                                                                                                           
:: [ 21:27:38 ] :: [   PASS   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com ovn-nbctl show | grep ls1' (Expected 0, got 0)                                                                   
:: [ 21:27:38 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs cluster stop 20.0.30.26'                                                                                     
20.0.30.26: Stopping Cluster (pacemaker)...                                                                                                                                                                 
20.0.30.26: Stopping Cluster (corosync)...                                                                                                                                                                  
:: [ 21:27:47 ] :: [   PASS   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs cluster stop 20.0.30.26' (Expected 0, got 0)                                                                 
:: [ 21:27:47 ] :: [  BEGIN   ] :: Running 'sleep 5'                                                                                                                                                        
:: [ 21:27:52 ] :: [   PASS   ] :: Command 'sleep 5' (Expected 0, got 0)                                                                                                                                    
:: [ 21:27:52 ] :: [  BEGIN   ] :: Running 'pcs status'                                                                                                                                                     
Cluster name: my_cluster                                                                                                                                                                                    
Stack: corosync                                                                                                                                                                                             
Current DC: 20.0.30.25 (version 2.0.2-3.el8-744a30d655) - partition with quorum                                                                                                                             
Last updated: Tue Dec 17 21:27:52 2019                                                                                                                                                                      
Last change: Tue Dec 17 21:27:43 2019 by root via crm_attribute on 20.0.30.25                                                                                                                               
                                                                                                                                                                                                            
2 nodes configured                                                                                                                                                                                          
3 resources configured                                                                                                                                                                                      
                                                                                                                                                                                                            
Online: [ 20.0.30.25 ]                                                                                                                                                                                      
OFFLINE: [ 20.0.30.26 ]                                                                                                                                                                                     
                                                                                                                                                                                                            
Full list of resources:                                                                                                                                                                                     
                                                                                                                                                                                                            
 ip-20.0.30.50  (ocf::heartbeat:IPaddr2):       Started 20.0.30.25                                                                                                                                          
 Clone Set: ovndb_servers-clone [ovndb_servers] (promotable)                                                                                                                                                
     Masters: [ 20.0.30.25 ]                                                                                                                                                                                
     Stopped: [ 20.0.30.26 ]                                                                                                                                                                                
                                                                                                                                                                                                            
Daemon Status:                                                                                                                                                                                              
  corosync: active/enabled                                                                                                                                                                                  
  pacemaker: active/enabled                                                                                                                                                                                 
  pcsd: active/enabled                                                                                                                                                                                      
:: [ 21:27:52 ] :: [   PASS   ] :: Command 'pcs status' (Expected 0, got 0)
:: [ 21:27:52 ] :: [  BEGIN   ] :: Running 'ovn-nbctl show | grep ls1'                                                                                                                                      
2019-12-18T02:27:52Z|00001|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:52Z|00002|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:52Z|00003|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
2019-12-18T02:27:52Z|00004|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:52Z|00005|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:52Z|00006|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
switch db3691ab-9ac5-44ea-b30b-4cb1cc74f69e (ls1)                                                                                                                                                           
:: [ 21:27:52 ] :: [   PASS   ] :: Command 'ovn-nbctl show | grep ls1' (Expected 0, got 0)                                                                                                                  
:: [ 21:27:52 ] :: [  BEGIN   ] :: Running 'ovn-nbctl ls-add ls2'                                                                                                                                           
2019-12-18T02:27:52Z|00002|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:52Z|00003|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:52Z|00004|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
2019-12-18T02:27:52Z|00005|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:52Z|00006|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:52Z|00007|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
:: [ 21:27:52 ] :: [   PASS   ] :: Command 'ovn-nbctl ls-add ls2' (Expected 0, got 0)                                                                                                                       
:: [ 21:27:52 ] :: [  BEGIN   ] :: Running 'ovn-nbctl show | grep ls2'                                                                                                                                      
2019-12-18T02:27:52Z|00001|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:52Z|00002|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:52Z|00003|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
2019-12-18T02:27:52Z|00004|ovsdb_idl|WARN|Logical_Router table in OVN_Northbound database lacks policies column (database needs upgrade?)                                                                   
2019-12-18T02:27:52Z|00005|ovsdb_idl|WARN|OVN_Northbound database lacks Logical_Router_Policy table (database needs upgrade?)                                                                               
2019-12-18T02:27:52Z|00006|ovsdb_idl|WARN|Logical_Switch_Port table in OVN_Northbound database lacks ha_chassis_group column (database needs upgrade?)                                                      
switch 1d2fb8da-0b27-47b4-b685-d0560399536d (ls2)                                                                                                                                                           
:: [ 21:27:52 ] :: [   PASS   ] :: Command 'ovn-nbctl show | grep ls2' (Expected 0, got 0) 

:: [ 21:27:52 ] :: [  BEGIN   ] :: Running 'scp 5.16.0/* hp-dl380pg8-12.rhts.eng.pek2.redhat.com:/usr/share/openvswitch/'                                                                                   
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                                                                                                                                                 
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @                                                                                                                                                 
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                                                                                                                                                 
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!                                                                                                                                                       
Someone could be eavesdropping on you right now (man-in-the-middle attack)!                                                                                                                                 
It is also possible that a host key has just been changed.                                                                                                                                                  
The fingerprint for the ECDSA key sent by the remote host is                                                                                                                                                
SHA256:hm17otUQa0XhnLmxOm5mIqnXs4YHVaIMZzq5rDcLDcc.                                                                                                                                                         
Please contact your system administrator.                                                                                                                                                                   
Add correct host key in no to get rid of this message.                                                                                                                                                      
Offending ECDSA key in no:2                                                                                                                                                                                 
Password authentication is disabled to avoid man-in-the-middle attacks.                                                                                                                                     
Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.                                                                                                                         
ovn-nb.ovsschema                                                   100%   23KB   2.0MB/s   00:00                                                                                                            
ovn-sb.ovsschema                                                   100%   20KB   3.8MB/s   00:00                                                                                                            
:: [ 21:27:53 ] :: [   PASS   ] :: Command 'scp 5.16.0/* hp-dl380pg8-12.rhts.eng.pek2.redhat.com:/usr/share/openvswitch/' (Expected 0, got 0)                                                               
:: [ 21:27:53 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs cluster start 20.0.30.26'                                                                                    
20.0.30.26: Starting Cluster...                                                                                                                                                                             
:: [ 21:27:57 ] :: [   PASS   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs cluster start 20.0.30.26' (Expected 0, got 0)                                                                
:: [ 21:27:57 ] :: [  BEGIN   ] :: Running 'sleep 5'                                                                                                                                                        
:: [ 21:28:02 ] :: [   PASS   ] :: Command 'sleep 5' (Expected 0, got 0)                                                                                                                                    
:: [ 21:28:02 ] :: [  BEGIN   ] :: Running 'pcs cluster stop 20.0.30.25 || pcs cluster stop dell-per740-12.rhts.eng.pek2.redhat.com'                                                                        
20.0.30.25: Stopping Cluster (pacemaker)...                                                                                                                                                                 
20.0.30.25: Stopping Cluster (corosync)...                                                                                                                                                                  
:: [ 21:28:08 ] :: [   PASS   ] :: Command 'pcs cluster stop 20.0.30.25 || pcs cluster stop dell-per740-12.rhts.eng.pek2.redhat.com' (Expected 0, got 0)                                                    
:: [ 21:28:08 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs status'                                                                                                      
Cluster name: my_cluster                                                                                                                                                                                    
Stack: corosync                                                                                                                                                                                             
Current DC: 20.0.30.26 (version 2.0.2-3.el8-744a30d655) - partition with quorum
Last updated: Tue Dec 17 21:28:09 2019
Last change: Tue Dec 17 21:28:05 2019 by root via crm_attribute on 20.0.30.26

2 nodes configured
3 resources configured

Online: [ 20.0.30.26 ]
OFFLINE: [ 20.0.30.25 ]

Full list of resources:

 ip-20.0.30.50  (ocf::heartbeat:IPaddr2):       Started 20.0.30.26
 Clone Set: ovndb_servers-clone [ovndb_servers] (promotable)
     Masters: [ 20.0.30.26 ]
     Stopped: [ 20.0.30.25 ]
     
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
:: [ 21:28:09 ] :: [   PASS   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs status' (Expected 0, got 0)
:: [ 21:28:09 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com ovn-nbctl show | grep ls2'
:: [ 21:28:09 ] :: [   FAIL   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com ovn-nbctl show | grep ls2' (Expected 0, got 1)

<==== not replicated
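
For reference, the failover sequence exercised by the log above can be condensed as follows. This is only a sketch that prints the commands in order and does not run them. The helper name failover_steps and its parameters are made up for illustration, while the commands, host name, node addresses, and the 5.16.0 schema directory are copied from the log.

```python
from typing import List


def failover_steps(backup_host: str, backup_node: str, master_node: str,
                   schema_dir: str, switch: str = "ls2") -> List[str]:
    """Return the shell commands of the failover test, in order.

    Commands without an ssh prefix run on the current master.
    """
    ssh = f"ssh -q {backup_host}"
    return [
        # Take the backup out of the cluster.
        f"{ssh} pcs cluster stop {backup_node}",
        # Write data on the master while the backup is down.
        f"ovn-nbctl ls-add {switch}",
        # Install the newer schema files on the backup.
        f"scp {schema_dir}/* {backup_host}:/usr/share/openvswitch/",
        # Bring the backup back and give it time to sync.
        f"{ssh} pcs cluster start {backup_node}",
        "sleep 5",
        # Stop the old master so the backup gets promoted.
        f"pcs cluster stop {master_node}",
        # The switch must be visible on the new master if replication worked.
        f"{ssh} ovn-nbctl show | grep {switch}",
    ]


steps = failover_steps(
    backup_host="hp-dl380pg8-12.rhts.eng.pek2.redhat.com",
    backup_node="20.0.30.26",
    master_node="20.0.30.25",
    schema_dir="5.16.0",
)
for number, command in enumerate(steps, start=1):
    print(f"{number}. {command}")
```

The last command is the one that fails on 2.11.0-26: the switch created on the old master never reached the backup.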

Verified on openvswitch2.11-2.11.0-35.el8fdp.x86_64:

:: [ 21:42:54 ] :: [   PASS   ] :: Command 'scp 5.16.0/* hp-dl380pg8-12.rhts.eng.pek2.redhat.com:/usr/share/openvswitch/' (Expected 0, got 0)
:: [ 21:42:54 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs cluster start 20.0.30.26'
20.0.30.26: Starting Cluster...                                                         
:: [ 21:42:58 ] :: [   PASS   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs cluster start 20.0.30.26' (Expected 0, got 0)
:: [ 21:42:58 ] :: [  BEGIN   ] :: Running 'sleep 5'                                                                      
:: [ 21:43:03 ] :: [   PASS   ] :: Command 'sleep 5' (Expected 0, got 0)                                             
:: [ 21:43:03 ] :: [  BEGIN   ] :: Running 'pcs cluster stop 20.0.30.25 || pcs cluster stop dell-per740-12.rhts.eng.pek2.redhat.com'
20.0.30.25: Stopping Cluster (pacemaker)...                                                                                              
20.0.30.25: Stopping Cluster (corosync)...                            
:: [ 21:43:09 ] :: [   PASS   ] :: Command 'pcs cluster stop 20.0.30.25 || pcs cluster stop dell-per740-12.rhts.eng.pek2.redhat.com' (Expected 0, got 0)
:: [ 21:43:09 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs status'
Cluster name: my_cluster                   
Stack: corosync                                                                                                                                                                                            
Current DC: 20.0.30.26 (version 2.0.2-3.el8-744a30d655) - partition with quorum            
Last updated: Tue Dec 17 21:43:10 2019                                                                                                                                                                     
Last change: Tue Dec 17 21:43:06 2019 by root via crm_attribute on 20.0.30.26              
                                                                                                                                                                                                           
2 nodes configured                                                                                      
3 resources configured                                                           
                              
Online: [ 20.0.30.26 ]         
OFFLINE: [ 20.0.30.25 ] 

Full list of resources:                                                         
                                                                                 
 ip-20.0.30.50  (ocf::heartbeat:IPaddr2):       Started 20.0.30.26              
 Clone Set: ovndb_servers-clone [ovndb_servers] (promotable)                    
     Masters: [ 20.0.30.26 ]                                                          
     Stopped: [ 20.0.30.25 ]                                                          
                                                                                
Daemon Status:     
  corosync: active/enabled                                                  
  pacemaker: active/enabled                                                                                                                                                                                
  pcsd: active/enabled                       
:: [ 21:43:10 ] :: [   PASS   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com pcs status' (Expected 0, got 0)
:: [ 21:43:10 ] :: [  BEGIN   ] :: Running 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com ovn-nbctl show | grep ls2'
switch 1045b43f-8829-44f6-94d0-4dc00118c057 (ls2)    
:: [ 21:43:10 ] :: [   PASS   ] :: Command 'ssh -q hp-dl380pg8-12.rhts.eng.pek2.redhat.com ovn-nbctl show | grep ls2' (Expected 0, got 0)

[root@dell-per740-12 bz1771854_replicate_old_schema]# rpm -qa | grep openvswitch
kernel-kernel-networking-openvswitch-ovn-basic-1.0-14.noarch
openvswitch-selinux-extra-policy-1.0-19.el8fdp.noarch
kernel-kernel-networking-openvswitch-ovn-common-1.0-6.noarch
kernel-kernel-networking-openvswitch-ovn-regression-bz1771854_replicate_old_schema-1.0-2.noarch
openvswitch2.11-2.11.0-35.el8fdp.x86_64

Comment 5 errata-xmlrpc 2020-01-22 04:02:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:0171