Bug 1843512

Summary: ovndb-server ocf resource agent does not work with ipv6+tls
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Michele Baldessari <michele>
Component: ovn2.13 Assignee: Numan Siddique <nusiddiq>
Status: CLOSED ERRATA QA Contact: Jianlin Shi <jishi>
Severity: high Docs Contact:
Priority: unspecified    
Version: FDP 20.A CC: ctrautma, jishi, kfida, ralongi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-07-15 13:00:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Michele Baldessari 2020-06-03 13:01:28 UTC
Description of problem:

This BZ is to track the inclusion of https://mail.openvswitch.org/pipermail/ovs-dev/2020-June/371333.html downstream.

We need this to get ovndb-server working with IPv6 and TLS in OSP16.

Version:
ovn2.13-2.13.0-30

Comment 9 Jianlin Shi 2020-06-11 01:17:41 UTC
reproducer steps:

add certificates:
[root@dell-per740-12 bz1843512]# cat add_ssl.sh
setenforce 0
ovs-pki init --force
pushd /etc/openvswitch

# Create certificates and keys for northbound db and ovn-nbctl,
# both using "controller" CA
ovs-pki req+sign northdb controller
ovs-pki req+sign nbctl controller

# Create certificates and keys for southbound db and ovn-controller,
# both using "switch" CA
ovs-pki req+sign southdb switch
ovs-pki req+sign chassis-1 switch
ovs-pki req+sign chassis-2 switch
popd

chown -R openvswitch /etc/openvswitch
chown -R openvswitch /var/lib/openvswitch

[root@dell-per740-42 bz1843512]# cat add_ssl.sh      
setenforce 0                                                 
ovs-pki init --force
scp root.eng.pek2.redhat.com:/etc/openvswitch/*.pem /etc/openvswitch/             
rm -rf /var/lib/openvswitch/pki/*
scp -r root.eng.pek2.redhat.com:/var/lib/openvswitch/pki/* /var/lib/openvswitch/pki/

setup pcs:
[root@dell-per740-12 bz1843512]# cat setup.sh                                                         
ip_c1=1111::25
ip_c2=1111::42
ip_v=1111::100

(sleep 2;echo "hacluster"; sleep 2; echo "redhat" ) |pcs host auth $ip_c1 $ip_c2
sleep 5
pcs cluster setup my_cluster --force --start $ip_c1 $ip_c2
pcs cluster enable --all
pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore
pcs cluster cib tmp-cib.xml
sleep 10
cp tmp-cib.xml tmp-cib.deltasrc

pcs resource delete ip-ipv6
pcs resource delete ovndb_servers-clone
sleep 5
pcs status
pcs -f tmp-cib.xml resource create ip-ipv6 ocf:heartbeat:IPaddr2 ip=$ip_v op monitor interval=30s
sleep 5
pcs -f tmp-cib.xml resource create ovndb_servers ocf:ovn:ovndb-servers manage_northd=yes master_ip=[$ip_v] nb_master_port=6641 sb_master_port=6642 nb_master_protocol=ssl sb_master_protocol=ssl ovn_nb_db_privkey=/etc/openvswitch/northdb-privkey.pem ovn_nb_db_cert=/etc/openvswitch/northdb-cert.pem ovn_nb_db_cacert=/var/lib/openvswitch/pki/controllerca/cacert.pem ovn_sb_db_privkey=/etc/openvswitch/southdb-privkey.pem ovn_sb_db_cert=/etc/openvswitch/southdb-cert.pem ovn_sb_db_cacert=/var/lib/openvswitch/pki/switchca/cacert.pem promotable
sleep 5
pcs -f tmp-cib.xml resource meta ovndb_servers-clone notify=true
pcs -f tmp-cib.xml constraint order start ip-ipv6 then promote ovndb_servers-clone
pcs -f tmp-cib.xml constraint colocation add ip-ipv6 with master ovndb_servers-clone
pcs cluster cib-push tmp-cib.xml diff-against=tmp-cib.deltasrc

reproduced on ovn2.13.0-33:

[root@dell-per740-12 bz1843512]# rpm -qa | grep -E "openvswitch|ovn"                                  
ovn2.13-2.13.0-33.el8fdp.x86_64
ovn2.13-host-2.13.0-33.el8fdp.x86_64
openvswitch2.13-2.13.0-38.el8fdp.x86_64                                                               
ovn2.13-central-2.13.0-33.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-23.el8fdp.noarch

[root@dell-per740-12 bz1843512]# pcs status
Cluster name: my_cluster                                                                              
Cluster Summary:                                                                                      
  * Stack: corosync
  * Current DC: 1111::25 (version 2.0.3-5.el8-4b1f869f0f) - partition with quorum                     
  * Last updated: Wed Jun 10 21:08:42 2020
  * Last change:  Wed Jun 10 21:05:56 2020 by root via crm_attribute on 1111::25                      
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ 1111::25 1111::42 ]

Full List of Resources:
  * ip-ipv6     (ocf::heartbeat:IPaddr2):       Started 1111::25                                      
  * Clone Set: ovndb_servers-clone [ovndb_servers] (promotable):
    * Masters: [ 1111::25 ]
    * Slaves: [ 1111::42 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/disabled
[root@dell-per740-12 bz1843512]# ovn-sbctl list connection    

<=== no sb connection on master
                                        
[root@dell-per740-12 bz1843512]# netstat  -anp -6
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp6       0      0 :::111                  :::*                    LISTEN      1/systemd           
tcp6       0      0 :::2224                 :::*                    LISTEN      23658/platform-pyth 
tcp6       0      0 ::1:8081                :::*                    LISTEN      3916/restraintd     
tcp6       0      0 :::22                   :::*                    LISTEN      1707/sshd           
tcp6       0      0 ::1:25                  :::*                    LISTEN      12012/master        
tcp6       1      0 2620:52:0:4982:22:37258 2620:52:0:4902:505:8000 CLOSE_WAIT  3916/restraintd     
udp6       0      0 :::111                  :::*                                1/systemd           
udp6       0      0 ::1:323                 :::*                                1576/chronyd        
udp6       0      0 1111::25:5405           :::*                                80851/corosync      
raw6       0      0 :::58                   :::*                    7           22063/NetworkManage

<=== nothing is listening on ports 6641 and 6642 on master

[root@dell-per740-42 bz1843512]# ovn-nbctl show
ovn-nbctl: transaction error: {"details":"insert operation not allowed when database server is in read only mode","error":"not allowed"}

<=== fail on slave

2020-06-11T01:06:04.069Z|00019|reconnect|INFO|ssl:[1111::100]:6641: connection attempt failed (Connection refused)

<=== error in /var/log/ovn/ovsdb-server-nb.log on slave
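For reference, the connection targets in this report put the IPv6 literal in square brackets: ssl:[1111::100]:6641 in the log line above, and pssl:6642:[1111::100] for the listener. The sketch below shows one way to build such strings in shell. The function names (bracket_ip, active_target, passive_target) are hypothetical and are not taken from the ovndb-servers agent or the upstream patch; treating any address that contains a colon as IPv6 is a simplifying assumption.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not the agent's actual code.

# Wrap an IPv6 literal in square brackets; leave IPv4 addresses and
# hostnames untouched. Assumption: "contains a colon" means IPv6.
bracket_ip() {
    case "$1" in
        \[*\]) printf '%s\n' "$1" ;;     # already bracketed
        *:*)   printf '[%s]\n' "$1" ;;   # bare IPv6 literal
        *)     printf '%s\n' "$1" ;;     # IPv4 or hostname
    esac
}

# Active (client) target: PROTO:IP:PORT
active_target() {   # args: proto ip port
    printf '%s:%s:%s\n' "$1" "$(bracket_ip "$2")" "$3"
}

# Passive (listener) target: pPROTO:PORT:IP
passive_target() {  # args: proto port ip
    printf 'p%s:%s:%s\n' "$1" "$2" "$(bracket_ip "$3")"
}

active_target ssl 1111::100 6641        # prints ssl:[1111::100]:6641
passive_target ssl 6642 '[1111::100]'   # prints pssl:6642:[1111::100]
```

Accepting an already bracketed address keeps the helper usable with values like master_ip=[$ip_v] from setup.sh without producing double brackets.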


Verified on ovn2.13.0-34:

[root@dell-per740-12 bz1843512]# rpm -qa | grep -E "openvswitch|ovn"                                  
ovn2.13-2.13.0-34.el8fdp.x86_64
ovn2.13-central-2.13.0-34.el8fdp.x86_64
openvswitch2.13-2.13.0-38.el8fdp.x86_64                                                               
openvswitch-selinux-extra-policy-1.0-23.el8fdp.noarch
ovn2.13-host-2.13.0-34.el8fdp.x86_64

[root@dell-per740-12 bz1843512]# pcs status                                                           
Cluster name: my_cluster                                                                              
Cluster Summary:
  * Stack: corosync
  * Current DC: 1111::25 (version 2.0.3-5.el8-4b1f869f0f) - partition with quorum
  * Last updated: Wed Jun 10 21:15:18 2020                                                            
  * Last change:  Wed Jun 10 21:15:17 2020 by root via crm_attribute on 1111::25                      
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ 1111::25 1111::42 ]

Full List of Resources:                                                                               
  * ip-ipv6     (ocf::heartbeat:IPaddr2):       Started 1111::25                                      
  * Clone Set: ovndb_servers-clone [ovndb_servers] (promotable):
    * Masters: [ 1111::25 ]
    * Slaves: [ 1111::42 ]

Daemon Status:                                                                                        
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/disabled
[root@dell-per740-12 bz1843512]# ovn-sbctl list connection                                            
_uuid               : 881363c8-cbd5-46ce-8720-5a18929f433e
external_ids        : {}
inactivity_probe    : 5000
is_connected        : true
max_backoff         : []
other_config        : {}
read_only           : false
role                : ""                                                                              
status              : {bound_port="6642", sec_since_connect="0", sec_since_disconnect="0"}
target              : "pssl:6642:[1111::100]"

<=== sb connection on master
[root@dell-per740-12 bz1843512]# netstat  -anp -6                                                     
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp6       0      0 :::111                  :::*                    LISTEN      1/systemd           
tcp6       0      0 :::2224                 :::*                    LISTEN      23658/platform-pyth 
tcp6       0      0 1111::100:6641          :::*                    LISTEN      85409/ovsdb-server  
tcp6       0      0 ::1:8081                :::*                    LISTEN      3916/restraintd     
tcp6       0      0 1111::100:6642          :::*                    LISTEN      85426/ovsdb-server  
tcp6       0      0 :::22                   :::*                    LISTEN      1707/sshd           
tcp6       0      0 ::1:25                  :::*                    LISTEN      12012/master        
tcp6       0      0 1111::100:6642          1111::42:35458          ESTABLISHED 85426/ovsdb-server  
tcp6       1      0 2620:52:0:4982:22:37330 2620:52:0:4902:505:8000 CLOSE_WAIT  3916/restraintd     
tcp6       0      0 1111::100:6641          1111::42:40402          ESTABLISHED 85409/ovsdb-server  
udp6       0      0 :::111                  :::*                                1/systemd           
udp6       0      0 ::1:323                 :::*                                1576/chronyd        
udp6       0      0 1111::25:5405           :::*                                85008/corosync      
raw6       0      0 :::58                   :::*                    7           22063/NetworkManage 

<=== ports 6641 and 6642 are listening on master
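
The listening check can also be scripted rather than read off the netstat output by eye. This is a minimal sketch, assuming the column layout shown above (local address in field 4, state in field 6); is_listening is a hypothetical helper name, and the sample text is copied from the output above so the snippet runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -anp -6`-style text on stdin and
# succeed if a tcp6 LISTEN row exists for ADDR:PORT.
is_listening() {  # args: addr port
    awk -v want="$1:$2" '
        $1 == "tcp6" && $4 == want && $6 == "LISTEN" { found = 1 }
        END { exit !found }
    '
}

# Sample rows taken from the verified output in this comment.
sample='tcp6       0      0 1111::100:6641          :::*                    LISTEN      85409/ovsdb-server
tcp6       0      0 1111::100:6642          :::*                    LISTEN      85426/ovsdb-server
tcp6       0      0 1111::100:6642          1111::42:35458          ESTABLISHED 85426/ovsdb-server'

for port in 6641 6642; do
    if printf '%s\n' "$sample" | is_listening 1111::100 "$port"; then
        echo "port $port: listening"
    else
        echo "port $port: NOT listening"
    fi
done
```

On a live node the same function could be fed directly, e.g. netstat -anp -6 | is_listening 1111::100 6641.
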
[root@dell-per740-12 bz1843512]# ovn-nbctl ls-add ls1
[root@dell-per740-12 bz1843512]# ovn-nbctl show
switch a4d137ac-ebe1-410e-a2bb-d8b55683ec57 (ls1)

[root@dell-per740-42 bz1843512]# ovn-nbctl show                                                       
switch a4d137ac-ebe1-410e-a2bb-d8b55683ec57 (ls1)

<=== db is replicated to the slave node

Comment 11 errata-xmlrpc 2020-07-15 13:00:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2941