Bug 1929978
| Summary: | [OVN] ovn-controller crashing with "failed in flood_remove_flows_for_sb_uuid()" | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Numan Siddique <nusiddiq> |
| Component: | ovn2.13 | Assignee: | Numan Siddique <nusiddiq> |
| Status: | CLOSED ERRATA | QA Contact: | Jianlin Shi <jishi> |
| Severity: | high | Priority: | high |
| Version: | FDP 20.H | CC: | ctrautma, dsedgmen, ffernand, jishi, mflusche, nusiddiq, pmannidi, ralongi, rkhan |
| Hardware: | Unspecified | OS: | Unspecified |
| Clone Of: | 1928012 | Bug Depends On: | 1928012 |
| Last Closed: | 2021-03-15 14:34:36 UTC | | |
Comment 2
Numan Siddique
2021-02-22 07:28:06 UTC
Tested with the following script:
systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:20.0.173.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.173.25
systemctl restart ovn-controller
ovs-vsctl \
    -- add-port br-int vif1 \
    -- set Interface vif1 type=internal external_ids:iface-id=sw0-p1 \
    ofport-request=1
ovs-vsctl set open . external_ids:ovn-monitor-all=true
ovn-nbctl ls-add sw0
ovn-nbctl pg-add pg1
ovn-nbctl pg-add pg2
ovn-nbctl lsp-add sw0 sw0-p2
ovn-nbctl lsp-set-addresses sw0-p2 "00:00:00:00:00:02 192.168.47.2"
ovn-nbctl lsp-add sw0 sw0-p3
ovn-nbctl lsp-set-addresses sw0-p3 "00:00:00:00:00:03 192.168.47.3"
# Pause ovn-northd. When it is resumed, all the below NB updates
# will be sent in one transaction.
ovn-appctl -t ovn-northd pause
ovn-nbctl lsp-add sw0 sw0-p1
ovn-nbctl lsp-set-addresses sw0-p1 "00:00:00:00:00:01 192.168.47.1"
ovn-nbctl pg-set-ports pg1 sw0-p1 sw0-p2
ovn-nbctl pg-set-ports pg2 sw0-p3
ovn-nbctl acl-add pg1 to-lport 1002 "outport == @pg1 && ip4 && ip4.src == \$pg2_ip4 && udp && udp.dst >= 1 && udp.dst <= 65535" allow-related
# resume ovn-northd now. This should result in a single update message
# from SB ovsdb-server to ovn-controller for all the above NB updates.
ovn-appctl -t ovn-northd resume
sleep 5
ovn-nbctl --wait=hv pg-set-ports pg1 sw0-p1 sw0-p2 sw0-p3
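The pause/resume step in the script above is what forces all NB updates into one SB update. A minimal sketch of that pattern as a reusable wrapper follows. The function name `with_northd_paused` and the `OVN_APPCTL` override variable are hypothetical and not part of OVN; only the `ovn-appctl -t ovn-northd pause` and `resume` commands are taken from the script.

```shell
# Hypothetical helper: run a command while ovn-northd is paused, then resume.
# OVN_APPCTL is an assumed override so the wrapper can be dry-run without OVN.
with_northd_paused() {
    appctl="${OVN_APPCTL:-ovn-appctl}"
    "$appctl" -t ovn-northd pause || return 1
    # Run the wrapped command and remember its exit status.
    status=0
    "$@" || status=$?
    # Resume even if the wrapped command failed, so northd is not left paused.
    "$appctl" -t ovn-northd resume || return 1
    return "$status"
}
```

For example, the NB updates from the script could be placed in a shell function (say `nb_updates`, also a hypothetical name) and run as `with_northd_paused nb_updates`.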
Reproduced on 20.12.0-20:
[root@wsfd-advnetlab21 bz1929978]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"
ovn2.13-central-20.12.0-20.el8fdp.x86_64
openvswitch2.13-2.13.0-95.el8fdp.x86_64
ovn2.13-20.12.0-20.el8fdp.x86_64
ovn2.13-host-20.12.0-20.el8fdp.x86_64
[root@wsfd-advnetlab21 bz1929978]# coredumpctl info
PID: 189852 (ovn-controller)
UID: 991 (openvswitch)
GID: 989 (openvswitch)
Signal: 6 (ABRT)
Timestamp: Tue 2021-02-23 21:03:17 EST (1min 31s ago)
Command Line: ovn-controller unix:/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfi>
Executable: /usr/bin/ovn-controller
Control Group: /system.slice/ovn-controller.service
Unit: ovn-controller.service
Slice: system.slice
Boot ID: 63e0e8f704cf4baeae3173198acebc75
Machine ID: 532695c076ab4d7696e8a30b5934d994
Hostname: wsfd-advnetlab21.anl.lab.eng.bos.redhat.com
Storage: /var/lib/systemd/coredump/core.ovn-controller.991.63e0e8f704cf4baeae3173198ac>
Message: Process 189852 (ovn-controller) of user 991 dumped core.
Stack trace of thread 189852:
#0 0x00007fb4a630e7ff raise (libc.so.6)
#1 0x00007fb4a62f8c35 abort (libc.so.6)
#2 0x000055e7bd513654 ovs_abort_valist (ovn-controller)
#3 0x000055e7bd51b444 vlog_abort_valist (ovn-controller)
#4 0x000055e7bd51b4ea vlog_abort (ovn-controller)
#5 0x000055e7bd51336b ovs_assert_failure (ovn-controller)
#6 0x000055e7bd43fa22 flood_remove_flows_for_sb_uuid (ovn-controller)
#7 0x000055e7bd43fe42 ofctrl_flood_remove_flows (ovn-controller)
#8 0x000055e7bd43aaa4 lflow_handle_changed_ref (ovn-controller)
#9 0x000055e7bd457978 _flow_output_resource_ref_handler (ovn-controller)
#10 0x000055e7bd470853 engine_run (ovn-controller)
#11 0x000055e7bd42d26c main (ovn-controller)
#12 0x00007fb4a62fa7b3 __libc_start_main (libc.so.6)
#13 0x000055e7bd42e94e _start (ovn-controller)
[root@wsfd-advnetlab21 bz1929978]# grep EMER /var/log/ovn/ovn-controller.log
2021-02-24T02:03:17.122Z|00017|util|EMER|controller/ofctrl.c:1199: assertion ovs_list_is_empty(&f->list_node) failed in flood_remove_flows_for_sb_uuid()
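When checking several hosts, it can help to reduce the EMER line above to just the name of the function that asserted. The sketch below assumes the log line format shown in this report; the helper name `failed_assertions` is hypothetical, and the default path is the one used in the `grep` command above.

```shell
# Hypothetical helper: print the function name of each failed assertion
# recorded at EMER level in an ovn-controller log.
failed_assertions() {
    log="${1:-/var/log/ovn/ovn-controller.log}"
    # Expected line shape (taken from this report):
    #   ...|util|EMER|file.c:LINE: assertion EXPR failed in FUNC()
    sed -n 's/.*|EMER|.*assertion .* failed in \([A-Za-z0-9_]*\)().*/\1/p' "$log"
}
```

On the affected build this would be expected to print `flood_remove_flows_for_sb_uuid`, and nothing when the log has no EMER assertion entries.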
Verified on 20.12.0-23:
[root@wsfd-advnetlab21 bz1929978]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"
ovn2.13-host-20.12.0-23.el8fdp.x86_64
ovn2.13-central-20.12.0-23.el8fdp.x86_64
openvswitch2.13-2.13.0-95.el8fdp.x86_64
ovn2.13-20.12.0-23.el8fdp.x86_64
[root@wsfd-advnetlab21 bz1929978]# coredumpctl list
No coredumps found.
[root@wsfd-advnetlab21 bz1929978]# grep EMER /var/log/ovn/ovn-controller.log
<=== ovn-controller didn't crash
Also no crash on 20.13.0-24 with the reproducer in comment 5. Set VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0839