Bug 2006744 - OSP13->OSP16.1 stopped working for ovn deployments
Summary: OSP13->OSP16.1 stopped working for ovn deployments
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.13
Version: FDP 21.H
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Mohammad Heib
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On: 1992705
Blocks: 2019451
Reported: 2021-09-22 10:24 UTC by Lukas Bezdicka
Modified: 2023-06-16 07:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1992705
Environment:
Last Closed: 2023-03-13 07:06:50 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | FD-1557 | 0 | None | None | None | 2021-09-22 10:26:48 UTC

Comment 2 Karrar Fida 2021-10-29 18:55:27 UTC
@mheib do you know when you will be able to do the backport?

Comment 3 Karrar Fida 2021-10-29 18:56:32 UTC
We need the backport ASAP: the FDP team needs to test the updated OVN 2.13 build, and we only have next week to do so.

Comment 4 Mohammad Heib 2021-10-31 16:53:55 UTC
@kfida sorry for the late response. I have completed the backport of the commit above and submitted my change to the ovn2.13 Gerrit repo for review.

Unfortunately, I don't know how to test it with OSP16.1, so I created a build with my change. I would appreciate it if you or @lbezdick could install the build from the link below and test it.

build link:
A yum repository for the build of ovn2.13-20.12.0-187.el8fdp (task 40715836) is available at:

http://brew-task-repos.usersys.redhat.com/repos/official/ovn2.13/20.12.0/187.el8fdp/

You can install the rpms locally by putting this .repo file in your /etc/yum.repos.d/ directory:

http://brew-task-repos.usersys.redhat.com/repos/official/ovn2.13/20.12.0/187.el8fdp/ovn2.13-20.12.0-187.el8fdp.repo

RPMs and build logs can be found in the following locations:
http://brew-task-repos.usersys.redhat.com/repos/official/ovn2.13/20.12.0/187.el8fdp/aarch64/
http://brew-task-repos.usersys.redhat.com/repos/official/ovn2.13/20.12.0/187.el8fdp/x86_64/
http://brew-task-repos.usersys.redhat.com/repos/official/ovn2.13/20.12.0/187.el8fdp/s390x/
http://brew-task-repos.usersys.redhat.com/repos/official/ovn2.13/20.12.0/187.el8fdp/ppc64le/
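
The .repo installation step described above could be scripted roughly as follows. This is a dry-run sketch, not a tested procedure: it only prints the commands so they can be reviewed first. The helper names `repo_dest` and `print_install_steps` are hypothetical, and the package list is an assumption that should be adjusted to what the node actually runs.

```shell
#!/usr/bin/env bash
# Sketch only: print the commands that would enable the scratch-build repo.
# REPO_URL is copied from the comment above; the package list is an assumption.
REPO_URL="http://brew-task-repos.usersys.redhat.com/repos/official/ovn2.13/20.12.0/187.el8fdp/ovn2.13-20.12.0-187.el8fdp.repo"

# Derive the destination path in /etc/yum.repos.d/ from the URL's file name.
repo_dest() {
    local url="$1"
    printf '/etc/yum.repos.d/%s\n' "${url##*/}"
}

# Emit the commands rather than running them, so they can be reviewed first.
print_install_steps() {
    local url="$1"
    printf 'curl -fsSL -o %s %s\n' "$(repo_dest "$url")" "$url"
    printf 'dnf install -y ovn2.13 ovn2.13-host ovn2.13-central\n'
}

print_install_steps "$REPO_URL"
```

Printing instead of executing keeps the sketch safe to run on any machine; the printed lines can be pasted into a root shell on the target node once checked.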

thanks,

Comment 7 Karrar Fida 2021-11-02 13:42:04 UTC
@lbezdick any update on the testing of ovn2.13-20.12.0-187.el8fdp?

Comment 8 Lukas Bezdicka 2021-11-02 16:57:16 UTC
I concluded that the jobs failed for this reason:
ovn_metadata_agent starts while ovn_db on the controller is not yet reachable, because the controller is being redeployed.
Instead of trying to reconnect, the agent crashes with an error, which triggers Docker to restart it.
Docker doubles the delay between restart attempts.
By the time the service is available on the controller node, Docker on the compute node has backed off so far that the next restart is 30 minutes away.
This causes the CI failure, because the metadata agent is unavailable for 30 minutes while we check for it.


The issue described above has nothing to do with the OVN issue, which I now consider solved.
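
To illustrate how a doubling restart delay can add up to an outage of that length, here is a small arithmetic sketch. The starting delay of 1 second and the count of 11 attempts are hypothetical values chosen for illustration, not Docker's actual settings; the function names are also made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: how a doubling restart delay grows.
# initial_delay and attempts are illustrative, not Docker's real values.

# Delay in force before the Nth restart (doubles after every failed start).
backoff_delay() {
    local initial_delay="$1" attempts="$2"
    local delay="$initial_delay" i
    for ((i = 1; i < attempts; i++)); do
        delay=$((delay * 2))
    done
    printf '%s\n' "$delay"
}

# Total time spent waiting across all attempts (sum of the doubling series).
backoff_total() {
    local initial_delay="$1" attempts="$2"
    local delay="$initial_delay" total=0 i
    for ((i = 0; i < attempts; i++)); do
        total=$((total + delay))
        delay=$((delay * 2))
    done
    printf '%s\n' "$total"
}

backoff_delay 1 11   # seconds to wait before the 11th restart
backoff_total 1 11   # cumulative seconds waited over 11 attempts
```

Under these assumed values the 11th restart alone waits 1024 seconds (about 17 minutes) and the cumulative wait is 2047 seconds (about 34 minutes), which is the order of magnitude described above. A reconnect loop inside the agent would avoid the crash and the backoff altogether.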

Comment 9 Jianlin Shi 2021-11-03 04:12:32 UTC
Tested with script in https://bugzilla.redhat.com/show_bug.cgi?id=1992705#c8:

Reproduced on ovn2.13-20.12.0-178.el8:

[root@wsfd-advnetlab20 bz2006744]# bash -x rep.sh                                                   
+ systemctl start openvswitch                                                                            
+ ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:20.0.175.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.175.26
+ systemctl restart ovn-controller                                                                  
+ ovs-vsctl set bridge br-int protocols=OpenFlow13,OpenFlow15                                                                                                        
ovs-vsctl: no row "br-int" in table Bridge                                                                                                                                                                 
+ ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2    
ovs-vsctl: no bridge named br-int                                                                      
+ ip netns add ls1p2                                                                                  
+ ip link set ls1p2 netns ls1p2                                                                     
Cannot find device "ls1p2"                                                                            
+ ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02                                     
Cannot find device "ls1p2"                                                                            
+ ip netns exec ls1p2 ip link set ls1p2 up                                                             
Cannot find device "ls1p2"                                                                           
+ ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2                                            
Cannot find device "ls1p2"

[root@wsfd-advnetlab20 bz2006744]# grep WARN /var/log/ovn/ovn-controller.log                         
2021-11-03T03:58:50.582Z|00004|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-11-03T03:58:50.582Z|00005|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)
2021-11-03T03:58:50.583Z|00006|ovsdb_idl|WARN|Open_vSwitch database lacks Datapath table (database needs upgrade?)
2021-11-03T03:58:50.583Z|00007|ovsdb_idl|WARN|Open_vSwitch table in Open_vSwitch database lacks datapaths column (database needs upgrade?)
2021-11-03T03:58:50.586Z|00012|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-11-03T03:58:50.586Z|00013|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-11-03T03:58:50.586Z|00014|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-11-03T03:58:50.587Z|00015|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
2021-11-03T03:58:50.587Z|00016|ovsdb_idl|WARN|transaction error: {"details":"datapaths is not a valid column name","error":"syntax error","syntax":"[\"bridges\",\"datapaths\"]"}
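
The warnings above are what distinguishes this failure from other causes of a missing br-int, so a reproducer could check for them directly. The following is a minimal sketch; the function name `has_schema_mismatch` is hypothetical and the patterns are taken only from the log lines shown in this comment.

```shell
#!/usr/bin/env bash
# Sketch only: flag the schema-mismatch warnings seen in the log above.
# Prints the number of matching lines and returns 0 if there is at least one,
# which suggests the Open_vSwitch database schema is older than what
# ovn-controller expects.
has_schema_mismatch() {
    local log_file="$1" count
    count=$(grep -c -E 'lacks .* \(database needs upgrade\?\)|is not a valid column name' "$log_file" 2>/dev/null)
    count=${count:-0}   # grep prints nothing if the file cannot be read
    printf '%s\n' "$count"
    [ "$count" -gt 0 ]
}

# Example call (path taken from the session above):
#   has_schema_mismatch /var/log/ovn/ovn-controller.log
```

The function counts lines instead of stopping at the first hit so that the number of repeated transaction errors is visible in the test output.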

Verified on ovn2.13-20.12.0-187.el8:

[root@wsfd-advnetlab20 bz2006744]# rpm -qa | grep -E "openvswitch|ovn"
openvswitch2.11-2.11.3-86.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch                                                 
ovn2.13-20.12.0-187.el8fdp.x86_64                                                                     
ovn2.13-host-20.12.0-187.el8fdp.x86_64
ovn2.13-central-20.12.0-187.el8fdp.x86_64

[root@wsfd-advnetlab20 bz2006744]# bash -x rep.sh                                                     
+ systemctl start openvswitch                                                                         
+ ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:1.1.182.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=1.1.182.26                                      
+ systemctl restart ovn-controller                                                                    
+ ovs-vsctl set bridge br-int protocols=OpenFlow13,OpenFlow15                                         
+ ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2
+ ip netns add ls1p2                                                                                  
+ ip link set ls1p2 netns ls1p2                                                                       
+ ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02                                     
+ ip netns exec ls1p2 ip link set ls1p2 up                                                            
+ ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2                                            
[root@wsfd-advnetlab20 bz2006744]# ip netns exec ls1p2 ping 192.168.1.1 -c 1                          
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.                                                  
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.259 ms

--- 192.168.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.259/0.259/0.259/0.000 ms
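
For automated runs, the ping result above could be turned into a pass/fail check. This is a sketch under the assumption that the ping summary line has the format shown in this session; `ping_ok` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: succeed if ping output read from stdin reports 0% packet loss.
# The leading (^|[^0-9]) keeps "100% packet loss" and "10% packet loss"
# from matching.
ping_ok() {
    grep -q -E '(^|[^0-9])0% packet loss'
}

# Example call (command taken from the session above):
#   ip netns exec ls1p2 ping 192.168.1.1 -c 1 | ping_ok && echo verified
```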

