Description of problem:
Tested migrating SDN to OVN, then rolled back to SDN. After the rollback, coredump files were observed on some of the nodes.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-11-23-195308

How reproducible:

Steps to Reproduce:
1. Migrate SDN to OVN
2. Roll back OVN to SDN
3. Check each node for coredump files

Actual results:

for f in $(oc get nodes -o jsonpath='{.items[*].metadata.name}') ; do oc debug node/"${f}" -- chroot /host coredumpctl list; done
Starting pod/ip-10-0-129-82us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
No coredumps found.
Removing debug pod ...
error: non-zero exit code from debug container
Starting pod/ip-10-0-136-69us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
TIME                         PID  UID  GID  SIG  COREFILE  EXE
Tue 2020-11-24 02:33:20 UTC  2845   0    0   11  present   /usr/bin/ovn-northd
Removing debug pod ...
Starting pod/ip-10-0-177-88us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
No coredumps found.
Removing debug pod ...
Starting pod/ip-10-0-185-53us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
TIME                         PID  UID  GID  SIG  COREFILE  EXE
Tue 2020-11-24 02:33:20 UTC  3118   0    0   11  present   /usr/bin/ovn-northd
Removing debug pod ...
Starting pod/ip-10-0-200-132us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
TIME                         PID  UID  GID  SIG  COREFILE  EXE
Tue 2020-11-24 02:33:19 UTC  2916   0    0   11  present   /usr/bin/ovn-northd
Removing debug pod ...
Starting pod/ip-10-0-217-222us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
No coredumps found.
Removing debug pod ...
error: non-zero exit code from debug container

huiran-mac:script hrwang$ for f in $(oc get nodes -o jsonpath='{.items[*].metadata.name}') ; do oc debug node/"${f}" -- chroot /host ls -lrt /var/lib/systemd/coredump/; done
Starting pod/ip-10-0-129-82us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 0
Removing debug pod ...
Starting pod/ip-10-0-136-69us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 4688
-rw-r-----. 1 root root 4796288 Nov 24 02:33 core.ovn-northd.0.c2009f9ffdc446faa9fa612b95fcc157.2845.1606185198000000.lz4
Removing debug pod ...
Starting pod/ip-10-0-177-88us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 0
Removing debug pod ...
Starting pod/ip-10-0-185-53us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 4496
-rw-r-----. 1 root root 4598629 Nov 24 02:33 core.ovn-northd.0.a2bce18de9434f7f9dc5dd065d2fc972.3118.1606185198000000.lz4
Removing debug pod ...
Starting pod/ip-10-0-200-132us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 4740
-rw-r-----. 1 root root 4849631 Nov 24 02:33 core.ovn-northd.0.1958581f4eb046a68e0144a5a13ba9e2.2916.1606185198000000.lz4
Removing debug pod ...
Starting pod/ip-10-0-217-222us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 0
Removing debug pod ...

Expected results:
No coredump files on any node.

Additional info:
Found this issue while running regression tests after the rollback from OVN to SDN; failed case logs:
http://ci-qe-openshift.usersys.redhat.com/userContent/cucushift/v3/2020/11/23/12:44:00/Should_not_break_the_cluster_when_creating_network_policy_with_incorrect_json_structure/console.html
Then reproduced it manually after rolling back from OVN to SDN. Checked a fresh SDN cluster and a fresh OVN cluster; the issue does not occur on either.
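If the stale ovn-northd core files need to be cleared from the nodes after the rollback, a minimal sketch (assuming the files stay under /var/lib/systemd/coredump/ as shown above, and that at least one core has already been saved for analysis before deleting) would be:

for f in $(oc get nodes -o jsonpath='{.items[*].metadata.name}') ; do
  # Remove only the ovn-northd core files left behind by the rollback
  oc debug node/"${f}" -- chroot /host sh -c 'rm -f /var/lib/systemd/coredump/core.ovn-northd.*'
done

This is only a cleanup workaround; it does not address why ovn-northd crashed (SIG 11) during the rollback.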
Created attachment 1732831 [details] coredump file
Assigning the bug to the OVN team so they can take a look at the northd coredump and determine whether it is expected given the rollback from OVN -> SDN. @OVN team: OCP 4.7.0-0.nightly-2020-11-23-195308 runs OVN ovn2.13-20.09.0-7.el8fdn.x86_64; I am not sure which version that corresponds to in your versioning scheme, so excuse me if the component version here is incorrect.
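For the initial triage, a minimal sketch for pulling the crash metadata and the core file off an affected node (assuming coredumpctl on the host as in the report above; <affected-node> is a placeholder for one of the nodes that shows a core, and the match by executable path will pick the ovn-northd dump):

# Show signal, command line, and stack metadata recorded for the ovn-northd crash
oc debug node/<affected-node> -- chroot /host coredumpctl info /usr/bin/ovn-northd

# Export the core file so it can be opened with gdb against matching ovn-northd debuginfo
oc debug node/<affected-node> -- chroot /host coredumpctl dump /usr/bin/ovn-northd -o /tmp/ovn-northd.core

The attached core file from comment 1 should contain the same data, so this is only needed if a fresh dump from a live reproduction is wanted.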
*** This bug has been marked as a duplicate of bug 1957030 ***