Bug 1878139 - [upstream][ovn-controller] Assertion failure when trying to uninstall logical flow
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: FDP 20.E
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: OVN Team
QA Contact: Ehsan Elahi
Reported: 2020-09-11 13:04 UTC by Dumitru Ceara
Modified: 2021-05-19 01:38 UTC (History)

Last Closed: 2020-09-18 15:49:50 UTC

Attachments:
core dump (1.25 MB, application/gzip)
2020-09-11 13:04 UTC, Dumitru Ceara
NB/SB/conf databases. (124.01 KB, application/gzip)
2020-09-11 13:11 UTC, Dumitru Ceara

Description Dumitru Ceara 2020-09-11 13:04:58 UTC
Created attachment 1714556
core dump

Description of problem:

With the recent addition of incremental processing for OVS flow installation (ofctrl.c), a bug was introduced that causes ovn-controller to abort while removing a logical flow: the assertion ovs_list_is_empty(&f->list_node) fails in flood_remove_flows_for_sb_uuid() (controller/ofctrl.c).

This BZ tracks the upstream work to fix the issue. Backtrace from the attached core dump:

#0  0x00007f90031cd625 in raise () from /lib64/libc.so.6
#1  0x00007f90031b68d9 in abort () from /lib64/libc.so.6
#2  0x00005607d7416b84 in ovs_abort_valist (err_no=err_no@entry=0, format=format@entry=0x5607d74f6b70 "%s: assertion %s failed in %s()", args=args@entry=0x7ffeccadb7a0) at lib/util.c:419
#3  0x00005607d741e965 in vlog_abort_valist (module_=<optimized out>, message=0x5607d74f6b70 "%s: assertion %s failed in %s()", args=args@entry=0x7ffeccadb7a0) at lib/vlog.c:1249
#4  0x00005607d741ea0a in vlog_abort (module=module@entry=0x5607d75b0cc0 <this_module>, message=message@entry=0x5607d74f6b70 "%s: assertion %s failed in %s()") at lib/vlog.c:1263
#5  0x00005607d741689b in ovs_assert_failure (where=where@entry=0x5607d74d2244 "controller/ofctrl.c:1108", function=function@entry=0x5607d74d29e0 <__func__.33006> "flood_remove_flows_for_sb_uuid", 
    condition=condition@entry=0x5607d74d27f0 "ovs_list_is_empty(&f->list_node)") at lib/util.c:86
#6  0x00005607d73502fa in flood_remove_flows_for_sb_uuid (flow_table=flow_table@entry=0x5607d8c4b380, sb_uuid=sb_uuid@entry=0x5607d8f91430, flood_remove_nodes=flood_remove_nodes@entry=0x7ffeccadb9c0) at controller/ofctrl.c:1135
#7  0x00005607d73503f2 in ofctrl_flood_remove_flows (flow_table=0x5607d8c4b380, flood_remove_nodes=flood_remove_nodes@entry=0x7ffeccadb9c0) at controller/ofctrl.c:1160
#8  0x00005607d734a8ea in lflow_handle_changed_flows (l_ctx_in=<optimized out>, l_ctx_out=0x7ffeccadba50) at controller/lflow.c:467
#9  0x00005607d7364375 in flow_output_sb_logical_flow_handler (node=0x7ffeccae0f90, data=0x5607d8c4b380) at controller/ovn-controller.c:1865
#10 0x00005607d737c933 in engine_compute (recompute_allowed=<optimized out>, node=<optimized out>) at lib/inc-proc-eng.c:306
#11 engine_run_node (recompute_allowed=<optimized out>, node=0x7ffeccae0f90) at lib/inc-proc-eng.c:352
#12 engine_run (recompute_allowed=<optimized out>) at lib/inc-proc-eng.c:377
#13 0x00005607d733fbcf in main (argc=<optimized out>, argv=<optimized out>) at controller/ovn-controller.c:2546

Version-Release number of selected component (if applicable):
The code that introduced this is only available upstream:

OVS revision: 5198e8a06928e3324e6fd11f6209c336611dffd2
OVN revision: 520189bf313054702f5f802acd7944cca3b6baaa

Steps to load the core dump:

docker run --detach --name ovn-bug --rm fedora:31 sleep infinity
docker cp ovn-controller-6.core.1698.ovn-master.gz ovn-bug:/
docker exec -it ovn-bug bash
# in the container
dnf install -y gcc gdb git make libtool autoconf rpm-build
dnf install -y checkpolicy desktop-file-utils gcc-c++ graphviz groff libcap-ng-devel openssl-devel procps-ng python3-devel selinux-policy-devel systemd-units unbound unbound-devel python3-sphinx
git clone https://github.com/openvswitch/ovs
pushd ovs
git checkout 5198e8a06928e3324e6fd11f6209c336611dffd2
./boot.sh && ./configure && make rpm-fedora
popd
git clone https://github.com/ovn-org/ovn
pushd ovn
git checkout 520189bf313054702f5f802acd7944cca3b6baaa
./boot.sh && ./configure --with-ovs-source=/ovs && make rpm-fedora
popd
dnf localinstall -y /ovs/rpm/rpmbuild/RPMS/x86_64/openvswitch-debuginfo-2.14.90-1.fc31.x86_64.rpm /ovs/rpm/rpmbuild/RPMS/noarch/openvswitch-selinux-policy-2.14.90-1.fc31.noarch.rpm /ovs/rpm/rpmbuild/RPMS/x86_64/openvswitch-2.14.90-1.fc31.x86_64.rpm /ovn/rpm/rpmbuild/RPMS/x86_64/ovn-20.06.90-1.fc31.x86_64.rpm /ovn/rpm/rpmbuild/RPMS/x86_64/ovn-debuginfo-20.06.90-1.fc31.x86_64.rpm /ovn/rpm/rpmbuild/RPMS/x86_64/ovn-host-20.06.90-1.fc31.x86_64.rpm /ovn/rpm/rpmbuild/RPMS/x86_64/ovn-host-debuginfo-20.06.90-1.fc31.x86_64.rpm
gunzip ovn-controller-6.core.1698.ovn-master.gz

gdb /usr/bin/ovn-controller ovn-controller-6.core.1698.ovn-master
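
To get the backtrace without an interactive session, gdb can be run in batch mode. The following is a minimal sketch, not part of the original steps: the function name print_crash_backtrace and the GDB override variable are made up for illustration, and frame 6 is chosen only because it is flood_remove_flows_for_sb_uuid() in the backtrace above.

```shell
#!/bin/sh
# Print the backtrace and the locals of frame 6 from a core file,
# then exit. "print_crash_backtrace" and the GDB variable are
# hypothetical names; GDB defaults to the gdb binary in PATH.
print_crash_backtrace() {
    binary=$1   # e.g. /usr/bin/ovn-controller
    core=$2     # e.g. ovn-controller-6.core.1698.ovn-master
    "${GDB:-gdb}" -batch \
        -ex 'bt' \
        -ex 'frame 6' \
        -ex 'info locals' \
        "$binary" "$core"
}
```

Inside the container this would be called as: print_crash_backtrace /usr/bin/ovn-controller ovn-controller-6.core.1698.ovn-master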

Also attaching the NB/SB databases and the conf.db of the node where ovn-controller crashed.

Comment 1 Dumitru Ceara 2020-09-11 13:11:36 UTC
Created attachment 1714559
NB/SB/conf databases.

Comment 2 Dumitru Ceara 2020-09-14 13:03:59 UTC
A simpler way to reproduce the bug, using the OVN sandbox:

make sandbox
ovn-nbctl ls-add ls
ovn-nbctl lsp-add ls vm1
ovn-nbctl lsp-set-addresses vm1 "0a:58:fc:09:1d:4e fd00:10:244:1::5"
ovn-nbctl lsp-set-port-security vm1 "0a:58:fc:09:1d:4e fd00:10:244:1::5"
ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal -- set interface vm1 external-ids:iface-id=vm1
sleep 1
ovs-vsctl set interface vm1 external-ids:iface-id=foo
sleep 1
ovs-vsctl set interface vm1 external-ids:iface-id=vm1
sleep 1
ovs-vsctl set interface vm1 external-ids:iface-id=foo
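
The iface-id toggling above can be wrapped in a small helper so that it is easy to repeat. This is only a sketch under assumptions: the function name toggle_iface_id and the OVS_VSCTL and DELAY variables are hypothetical, and "foo" is used, as in the steps above, as a port name that does not exist in the NB database. It assumes the logical switch, port and OVS interface were already created as shown.

```shell
#!/bin/sh
# Flip the iface-id of an OVS interface between a non-existent logical
# port ("foo") and the real one, pausing between changes so that
# ovn-controller processes each update separately.
# "toggle_iface_id", OVS_VSCTL and DELAY are hypothetical names.
toggle_iface_id() {
    iface=$1          # OVS interface name, e.g. vm1
    lport=$2          # logical port name, e.g. vm1
    rounds=${3:-2}    # number of foo -> lport cycles
    i=0
    while [ "$i" -lt "$rounds" ]; do
        "${OVS_VSCTL:-ovs-vsctl}" set interface "$iface" external-ids:iface-id=foo
        sleep "${DELAY:-1}"
        "${OVS_VSCTL:-ovs-vsctl}" set interface "$iface" external-ids:iface-id="$lport"
        sleep "${DELAY:-1}"
        i=$((i + 1))
    done
}
```

Calling toggle_iface_id vm1 vm1 after the add-port step replays the foo, vm1, foo sequence from the manual steps (followed by one more switch back to vm1).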

Comment 3 Dumitru Ceara 2020-09-18 15:49:50 UTC
Fixed upstream by: https://github.com/ovn-org/ovn/commit/5b0c2dc286770663656befb8080b309869845c4a

