Description of problem: While investigating upgrade failure in: https://bugzilla.redhat.com/show_bug.cgi?id=1880591#c29 I see northd and nbdb both crashed. nbdb is a segfault: 2020-09-28T07:05:42Z|00048|raft|INFO|current entry eid c412ebfa-44ea-4406-a6f8-3f0260801ab4 does not match prerequisite 0a392e2c-d534-47d0-8c65-7329dec47f6f in execute_command_request 2020-09-28T07:05:52Z|00049|raft|INFO|current entry eid 7887dfd3-7719-4a20-98c7-ea5552f70363 does not match prerequisite e50798a4-d7d1-4be2-b20b-11a8734cd445 in execute_command_request 2020-09-28T07:06:11Z|00050|raft|INFO|Dropped 4 log messages in last 20 seconds (most recently, 18 seconds ago) due to excessive rate 2020-09-28T07:06:11Z|00051|raft|INFO|current entry eid d9154f99-15c3-401b-9b1b-cdc006f15357 does not match prerequisite 438c7302-f9ba-492e-b1b1-c657f4c446a0 in execute_command_request 2020-09-28T07:08:08Z|00001|fatal_signal(log_fsync0)|WARN|terminating with signal 15 (Terminated) 2020-09-28T07:08:08Z|00001|fatal_signal(urcu2)|WARN|terminating with signal 15 (Terminated) 2020-09-28T07:08:08Z|00052|fatal_signal|WARN|terminating with signal 11 (Segmentation fault) The only thing unique to this upgrade was QE was trying a patch I wrote to remove certain reject ACLs, which was flawed and will just end up in an error from ovn-nbctl: E0928 14:12:52.224724 1 loadbalancer.go:291] Error while removing ACL: 866090f8-397f-4446-9b51-5d0e0f07a118-172.30.229.223\:443, from switches, error: OVN command '/usr/bin/ovn-nbctl --timeout=15 -- --if-exists remove logical_switch 51f5ee00-ae2b-48a9-92f3-7beb877b5106 acl 866090f8-397f-4446-9b51-5d0e0f07a118-172.30.229.223\:443 -- --if-exists remove logical_switch e9c07de0-65c5-47e3-ad1e-072a4ecc0bdb acl 866090f8-397f-4446-9b51-5d0e0f07a118-172.30.229.223\:443 -- --if-exists remove logical_switch e9319c12-33b5-49d0-bea7-eb41fcc3bd0a acl 866090f8-397f-4446-9b51-5d0e0f07a118-172.30.229.223\:443 -- --if-exists remove logical_switch 7ac93672-2b9a-4221-9ee9-1d9ad0fccfbb acl 866090f8-397f-4446-9b51-5d0e0f07a118-172.30.229.223\:443 -- --if-exists remove logical_switch 8af3f2d8-3a11-48c7-9015-d8501bad79b4 acl 866090f8-397f-4446-9b51-5d0e0f07a118-172.30.229.223\:443 -- --if-exists remove logical_switch 4650f909-f5cf-4465-97e5-01d52b9f5469 acl 866090f8-397f-4446-9b51-5d0e0f07a118-172.30.229.223\:443' failed: exit status 1 [root@ip-10-0-74-3 ~]# rpm -qa |grep ovn ovn2.13-central-20.06.2-11.el8fdp.x86_64 ovn2.13-vtep-20.06.2-11.el8fdp.x86_64 ovn2.13-20.06.2-11.el8fdp.x86_64 ovn2.13-host-20.06.2-11.el8fdp.x86_64 Will attach dbs and coredumps. If nbdb crash is different from northd let me know and I'll open a separate bug.
Created attachment 1717279 [details] ovsdb server core dump
Created attachment 1717280 [details] northd core dump
Created attachment 1717281 [details] ovn dbs
I'm not seeing this anymore, and it looks like the segfault was caused by the termination. I'm not sure when it happened during upgrade due to the log rotation. I'd suggest we close this for now and if we see it again reopen.
(In reply to Tim Rozet from comment #5) > I'm not seeing this anymore, and it looks like the segfault was caused by > the termination. I'm not sure when it happened during upgrade due to the log > rotation. I'd suggest we close this for now and if we see it again reopen. Thanks for the update. Closing the BZ for now.