Bug 1992705 - FFWD2 OSP13->OSP16.2 stopped working for ovn deployments
Summary: FFWD2 OSP13->OSP16.2 stopped working for ovn deployments
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: FDP 21.F
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: OVN Team
QA Contact: Jianlin Shi
URL:
Whiteboard:
Duplicates: 2016343 (view as bug list)
Depends On:
Blocks: 1983106 1993154 2006744 2009741 2019446
 
Reported: 2021-08-11 15:34 UTC by Lukas Bezdicka
Modified: 2021-11-04 23:27 UTC (History)

Fixed In Version: ovn-2021-21.06.0-24.el8fdp ovn-2021-21.09.0-5.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2006744 (view as bug list)
Environment:
Last Closed: 2021-09-07 18:03:43 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | FD-1484 | 0 | None | None | None | 2021-08-11 15:38:40 UTC
Red Hat Product Errata | RHBA-2021:3450 | 0 | None | None | None | 2021-09-07 18:03:50 UTC

Description Lukas Bezdicka 2021-08-11 15:34:30 UTC
Description of problem:
OVN CI jobs started to fail - https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/upgrades/view/ffu/

The root cause is that during the upgrade the controllers move to the latest version, where the OVN container runs on a host with OVS 2.13, but the computes need a "hybrid" mode in which only the OVN container is switched to the latest version while the host stays on OVS 2.11. This worked well until the new version of OVN.


Error in logs:
2021-08-11T14:44:54.158Z|00038|reconnect|INFO|tcp:172.17.1.46:6642: continuing to reconnect in the background but suppressing further logging                                                                                                
2021-08-11T14:46:26.727Z|00039|fatal_signal|WARN|terminating with signal 15 (Terminated)
2021-08-11T14:46:31.546Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-controller.log
2021-08-11T14:46:31.548Z|00002|reconnect|INFO|unix:/run/openvswitch/db.sock: connecting...
2021-08-11T14:46:31.548Z|00003|reconnect|INFO|unix:/run/openvswitch/db.sock: connected
2021-08-11T14:46:31.550Z|00004|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-08-11T14:46:31.550Z|00005|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)                                                                                                   
2021-08-11T14:46:31.550Z|00006|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-08-11T14:46:31.551Z|00007|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)                                                                                                   
2021-08-11T14:46:31.553Z|00008|main|INFO|OVN internal version is : [21.06.0-20.17.0-56.0]
2021-08-11T14:46:31.553Z|00009|main|INFO|OVS IDL reconnected, force recompute.
2021-08-11T14:46:31.553Z|00010|reconnect|INFO|tcp:172.17.1.46:6642: connecting...
2021-08-11T14:46:31.553Z|00011|main|INFO|OVNSB IDL reconnected, force recompute.
2021-08-11T14:46:31.553Z|00012|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}                                                                        
2021-08-11T14:46:31.554Z|00013|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}                                                                        
2021-08-11T14:46:31.554Z|00014|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}                                                                        
2021-08-11T14:46:31.555Z|00015|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}                                                                        
2021-08-11T14:46:31.555Z|00016|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}                                                                        
2021-08-11T14:46:32.215Z|00017|reconnect|INFO|tcp:172.17.1.60:6642: connecting...
2021-08-11T14:46:33.215Z|00018|reconnect|INFO|tcp:172.17.1.60:6642: connection attempt timed out
2021-08-11T14:46:34.215Z|00019|reconnect|INFO|tcp:172.17.1.60:6642: connecting...
2021-08-11T14:46:35.215Z|00020|reconnect|INFO|tcp:172.17.1.60:6642: connection attempt timed out
2021-08-11T14:46:35.215Z|00021|reconnect|INFO|tcp:172.17.1.60:6642: waiting 2 seconds before reconnect
2021-08-11T14:46:37.215Z|00022|reconnect|INFO|tcp:172.17.1.60:6642: connecting...
2021-08-11T14:46:39.057Z|00023|reconnect|INFO|tcp:172.17.1.60:6642: connection attempt failed (No route to host)
2021-08-11T14:46:39.057Z|00024|reconnect|INFO|tcp:172.17.1.60:6642: waiting 4 seconds before reconnect
2021-08-11T14:46:41.545Z|00025|memory|INFO|4488 kB peak resident set size after 10.0 seconds
2021-08-11T14:46:43.057Z|00026|reconnect|INFO|tcp:172.17.1.60:6642: connecting...
2021-08-11T14:46:45.069Z|00027|reconnect|INFO|tcp:172.17.1.60:6642: connection attempt failed (No route to host)
2021-08-11T14:46:45.069Z|00028|reconnect|INFO|tcp:172.17.1.60:6642: continuing to reconnect in the background but suppressing further logging
2021-08-11T14:47:31.553Z|00029|ovsdb_idl|WARN|Dropped 269083 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:47:31.553Z|00030|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:48:31.553Z|00031|ovsdb_idl|WARN|Dropped 271389 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:48:31.553Z|00032|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:49:31.553Z|00033|ovsdb_idl|WARN|Dropped 266958 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:49:31.553Z|00034|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:50:31.553Z|00035|ovsdb_idl|WARN|Dropped 270291 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:50:31.553Z|00036|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
2021-08-11T14:51:31.553Z|00037|ovsdb_idl|WARN|Dropped 267861 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-08-11T14:51:31.553Z|00038|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"datapaths\"]"}
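
A quick way to confirm this failure mode from a log like the one above is to pull the rejected column names out of the "transaction error" lines. The following is a minimal sketch and not part of the original report: the helper name missing_columns is hypothetical, and it assumes the log line format shown in the excerpt.

```shell
#!/bin/sh
# missing_columns (hypothetical helper): read ovn-controller log lines on
# stdin and print each column name that ovsdb-server rejected, once.
missing_columns() {
    sed -n 's/.*"details":"\([A-Za-z_]*\) is not a valid column name".*/\1/p' | sort -u
}
```

Feeding it the log from the description, for example with missing_columns < /var/log/openvswitch/ovn-controller.log, should list datapaths as the only rejected column.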



Version-Release number of selected component (if applicable):

docker ps | grep ovn_controller
c9aa688033bf        undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ovn-controller:16.2_20210804.1   "dumb-init --singl..."   45 minutes ago      Up 45 minutes                                   ovn_controller

[root@compute-0 ~]# docker exec -u root -it ovn_controller rpm -qa | grep openv
network-scripts-openvswitch2.15-2.15.0-26.el8fdp.x86_64
openvswitch2.15-2.15.0-26.el8fdp.x86_64
python3-openvswitch2.15-2.15.0-26.el8fdp.x86_64
rhosp-network-scripts-openvswitch-2.15-4.el8ost.1.noarch
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
rhosp-openvswitch-2.15-4.el8ost.1.noarch
python3-rhosp-openvswitch-2.15-4.el8ost.1.noarch
[root@compute-0 ~]# docker exec -u root -it ovn_controller rpm -qa | grep ovn
ovn-2021-21.06.0-17.el8fdp.x86_64
rhosp-ovn-2021-4.el8ost.1.noarch
ovn-2021-host-21.06.0-17.el8fdp.x86_64
rhosp-ovn-host-2021-4.el8ost.1.noarch

[root@compute-0 ~]# rpm -qa | grep openvswitch
rhosp-openvswitch-2.11-0.7.el7ost.noarch
rhosp-openvswitch-ovn-central-2.11-0.7.el7ost.noarch
python-openvswitch2.11-2.11.3-86.el7fdp.x86_64
rhosp-openvswitch-ovn-host-2.11-0.7.el7ost.noarch
openvswitch-selinux-extra-policy-1.0-17.el7fdp.noarch
python-rhosp-openvswitch-2.11-0.7.el7ost.noarch
openstack-neutron-openvswitch-12.1.1-42.el7ost.noarch
openvswitch2.11-2.11.3-86.el7fdp.x86_64
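
The "hybrid" state is visible in the package lists above: the container carries openvswitch2.15 while the host still has openvswitch2.11. As a rough sketch, the OVS version can be extracted from rpm -qa output so the two sides can be compared in a script. The helper name ovs_version_from_rpms is hypothetical, and the pattern assumes the openvswitchX.Y-VERSION-RELEASE naming seen in this report.

```shell
#!/bin/sh
# ovs_version_from_rpms (hypothetical helper): read `rpm -qa` output on stdin
# and print the version of the first openvswitchX.Y package found.
# Only names that start with "openvswitch<digit>" match, so packages such as
# python-openvswitch2.11 or openvswitch-selinux-extra-policy are skipped.
ovs_version_from_rpms() {
    sed -n 's/^openvswitch[0-9][0-9.]*-\([0-9][0-9.]*\)-.*/\1/p' | head -n 1
}
```

On a compute node this could be run once as rpm -qa | ovs_version_from_rpms for the host and once as docker exec -u root ovn_controller rpm -qa | ovs_version_from_rpms for the container (without -it, so the output is plain lines).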

Comment 1 Jakub Libosvar 2021-08-11 15:37:28 UTC
The impact is that OVN can't talk to the local ovsdb and therefore cannot, for example, create patch ports between br-int and the provider bridges.

Comment 2 Mark Michelson 2021-08-18 15:49:01 UTC
FYI, it appears that the "datapaths" column was added in August 2019 to OVS, which means it would be present in OVS 2.12+. In general, the OVN team recommends using whatever version of OVS is contemporary with the current version of OVN. When it comes to versions of OVN from the 20.XX series onward (ovn2.13 and ovn-2021 in RHEL), those should be compatible with OVS 2.13+. In this case, trying to upgrade OVN but using OVS 2.11 is going to cause this issue. OVS needs to be at 2.13+.
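
The version requirement described in this comment could be turned into a pre-flight check. This is only a sketch: version_ge is a hypothetical helper, it assumes a sort that supports -V (GNU coreutils), and the 2.13 threshold is simply the minimum named in the comment above.

```shell
#!/bin/sh
# version_ge (hypothetical helper): succeed when version $1 >= version $2.
# Relies on `sort -V` (version sort): if the smaller of the two versions
# is $2, then $1 is at least $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_ovs_for_ovn (hypothetical helper): warn when the given OVS version
# is older than the minimum passed as $2 (default 2.13, per the comment).
check_ovs_for_ovn() {
    min=${2:-2.13}
    if version_ge "$1" "$min"; then
        echo "OVS $1 meets the minimum ($min)"
    else
        echo "OVS $1 is older than $min" >&2
        return 1
    fi
}
```

With the versions from the description, check_ovs_for_ovn 2.11.3 fails and check_ovs_for_ovn 2.15.0 succeeds.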

Comment 3 Dan Williams 2021-08-18 17:11:08 UTC
Question for OSP: does 21.06.0-13 work?

A bunch of the datapath related code in 21.06 was added to 21.06.0-14 for "ovn-controller: Detect OVS datapath capabilities."

Comment 4 Mark Michelson 2021-08-18 18:06:29 UTC
@Dan, I don't think that will help in this case. The problem here is a schema mismatch, and that causes errors upon connection when ovn-controller sends its "monitor_cond" (or "monitor_cond_since", I'm not sure which exactly) request. ovn-controller is interested in these columns that do not exist in the current version of OVS, and there's not a way for ovn-controller to fall back to requesting monitoring based on an older schema.

Comment 5 Mark Michelson 2021-08-30 18:52:33 UTC
Changing issue to MODIFIED. I was wrong in my reply to Dan earlier. It turns out the probing for capabilities relied on a database table that was added during the OVS 2.12 cycle. As such, this exposed the incompatibility with OVS 2.11. The offending commits have been reverted in order to provide a build. A longer-term solution will be created to allow us to check datapath capabilities without relying on assumptions about the schema.
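
Whether a given ovsdb-server exposes the table in question can be checked from its table listing before ovn-controller is started. The following is a minimal sketch: has_table is a hypothetical helper that expects the output of ovsdb-client list-tables on stdin, and the socket path in the usage note is the one from the log in the description.

```shell
#!/bin/sh
# has_table (hypothetical helper): succeed when the table named in $1
# appears in `ovsdb-client list-tables` output read from stdin.
# The listing prints one table name per line under a "Table" header,
# so comparing the first field of each line is enough.
has_table() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit found ? 0 : 1 }'
}
```

For example, ovsdb-client list-tables unix:/run/openvswitch/db.sock Open_vSwitch | has_table Datapath would be expected to fail against the OVS 2.11 host from this report, matching the "lacks Datapath table" warnings.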

Comment 8 Jianlin Shi 2021-09-01 08:07:34 UTC
Steps to reproduce:

1. install openvswitch2.11-2.11.3-86.el8fdp and ovn-2021-21.06.0-17.el8fdp on client
2. install openvswitch2.15-2.15.0-26.el8fdp and ovn-2021-21.06.0-17.el8fdp on server
3. start ovn on server:
systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:20.0.175.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.175.25
systemctl restart ovn-controller

ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ls1p1
ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:01 192.168.1.1"

ovn-nbctl lsp-add ls1 ls1p2
ovn-nbctl lsp-set-addresses ls1p2 "00:00:00:01:01:02 192.168.1.2"

ovs-vsctl add-port br-int ls1p1 -- set interface ls1p1 type=internal external_ids:iface-id=ls1p1

ip netns add ls1p1
ip link set ls1p1 netns ls1p1
ip netns exec ls1p1 ip link set ls1p1 address 00:00:00:01:01:01
ip netns exec ls1p1 ip link set ls1p1 up
ip netns exec ls1p1 ip addr add 192.168.1.1/24 dev ls1p1

4. start ovn on client
systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:20.0.175.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.175.26
systemctl restart ovn-controller
ovs-vsctl set bridge br-int protocols=OpenFlow13,OpenFlow15

ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2

ip netns add ls1p2
ip link set ls1p2 netns ls1p2
ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02
ip netns exec ls1p2 ip link set ls1p2 up                                                              
ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2

result on client:

+ ovs-vsctl set bridge br-int protocols=OpenFlow13,OpenFlow15                                         
ovs-vsctl: no row "br-int" in table Bridge
+ ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2    
ovs-vsctl: no bridge named br-int                                                                     
+ ip netns add ls1p2                                                                                  
+ ip link set ls1p2 netns ls1p2                                                                       
Cannot find device "ls1p2"                                                                            
+ ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02                                     
Cannot find device "ls1p2"                                                                            
+ ip netns exec ls1p2 ip link set ls1p2 up                                                            
Cannot find device "ls1p2"                                                                            
+ ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2                                            
Cannot find device "ls1p2"
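
The follow-on errors above all stem from br-int never being created by ovn-controller. A reproducer script could stop at that point instead of running the remaining commands. This is a sketch under assumptions: wait_until is a hypothetical helper, and the probe in the usage note relies on ovs-vsctl br-exists returning a non-zero exit status when the bridge is absent.

```shell
#!/bin/sh
# wait_until (hypothetical helper): retry a command once per second until it
# succeeds or the timeout in seconds ($1) has elapsed.
# Returns 0 as soon as the command succeeds, 1 on timeout.
wait_until() {
    timeout_s=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Placed right after systemctl restart ovn-controller, a line such as wait_until 30 ovs-vsctl br-exists br-int || exit 1 would end the run with a single clear failure on the affected build; the 30-second timeout is an arbitrary choice.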

[root@wsfd-advnetlab17 bz1992705]# grep WARN /var/log/ovn/ovn-controller.log
2021-09-01T08:04:24.020Z|00004|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-09-01T08:04:24.020Z|00005|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)
2021-09-01T08:04:24.020Z|00006|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-09-01T08:04:24.020Z|00007|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)
2021-09-01T08:04:24.024Z|00013|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.024Z|00014|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.026Z|00015|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.027Z|00016|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.027Z|00017|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-09-01T08:04:24.030Z|00020|rconn|WARN|unix:/run/openvswitch/br-int.mgmt: connection failed (No such file or directory)

[root@wsfd-advnetlab17 bz1992705]# rpm -qa | grep -E "openvswitch2.11|ovn-2021"
ovn-2021-21.06.0-17.el8fdp.x86_64
openvswitch2.11-2.11.3-86.el8fdp.x86_64
ovn-2021-central-21.06.0-17.el8fdp.x86_64
ovn-2021-host-21.06.0-17.el8fdp.x86_64

Verified on ovn-2021-21.06.0-24:

[root@wsfd-advnetlab17 bz1992705]# bash -x client.sh                                                  
+ systemctl start openvswitch
+ systemctl start ovn-northd
+ ovn-nbctl set-connection ptcp:6641
+ ovn-sbctl set-connection ptcp:6642                                                                  
+ ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:20.0.175.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.175.26
+ systemctl restart ovn-controller
+ ovs-vsctl set bridge br-int protocols=OpenFlow13,OpenFlow15                                         
+ ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2    
+ ip netns add ls1p2                                                                                  
+ ip link set ls1p2 netns ls1p2                                                                       
+ ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02                                     
+ ip netns exec ls1p2 ip link set ls1p2 up                                                            
+ ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2 

[root@wsfd-advnetlab17 bz1992705]# rpm -qa | grep -E "openvswitch2.11|ovn-2021"                       
ovn-2021-central-21.06.0-24.el8fdp.x86_64                                                             
openvswitch2.11-2.11.3-86.el8fdp.x86_64                                                               
ovn-2021-21.06.0-24.el8fdp.x86_64                                                                     
ovn-2021-host-21.06.0-24.el8fdp.x86_64

[root@wsfd-advnetlab16 bz1992705]# ovn-sbctl show                                                     
Chassis hv1                                                                                           
    hostname: wsfd-advnetlab16.anl.lab.eng.bos.redhat.com
    Encap geneve                                                                                      
        ip: "20.0.175.25"                                                                             
        options: {csum="true"}                                                                        
    Port_Binding ls1p1                                                                                
Chassis hv0                                                                                           
    hostname: wsfd-advnetlab17.anl.lab.eng.bos.redhat.com
    Encap geneve                                                                                      
        ip: "20.0.175.26"                                                                             
        options: {csum="true"}                                                                        
    Port_Binding ls1p2                                                                                
[root@wsfd-advnetlab16 bz1992705]# ip netns exec ls1p1 ping 192.168.1.2 -c 1
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=2.21 ms
                                                                                                      
--- 192.168.1.2 ping statistics ---                                                                   
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.205/2.205/2.205/0.000 ms
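
For automated verification, the ping summary line can be reduced to a pass/fail result. A minimal sketch, assuming the summary format shown above; ping_loss_pct and ping_ok are hypothetical helper names.

```shell
#!/bin/sh
# ping_loss_pct (hypothetical helper): read ping output on stdin and print
# the packet loss percentage from the summary line (without the % sign).
ping_loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# ping_ok (hypothetical helper): succeed only when the reported loss is 0.
ping_ok() {
    [ "$(ping_loss_pct)" = "0" ]
}
```

For example, ip netns exec ls1p1 ping 192.168.1.2 -c 1 | ping_ok succeeds for the output shown above.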

Comment 10 errata-xmlrpc 2021-09-07 18:03:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3450

Comment 14 Mark Michelson 2021-10-29 17:13:12 UTC
*** Bug 2016343 has been marked as a duplicate of this bug. ***

Comment 15 Lukas Bezdicka 2021-11-04 23:27:42 UTC
1518 packets transmitted, 1518 received, 0% packet loss, time 1547210ms
ovn-2021-21.09.0-5.el8fdp.x86_64

