Bug 1900938 - [Migration] Observe some ovn coredump files existing in the nodes
Keywords:
Status: CLOSED DUPLICATE of bug 1957030
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.13
Version: FDP 20.H
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: OVN Team
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-24 03:41 UTC by huirwang
Modified: 2021-10-29 14:05 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-29 13:58:27 UTC
Target Upstream Version:
Embargoed:


Attachments
coredump file (5.64 MB, application/gzip), 2020-11-24 05:54 UTC, huirwang


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-963 0 None None None 2021-10-29 14:05:03 UTC

Description huirwang 2020-11-24 03:41:44 UTC
Description of problem:
Tested migrating from SDN to OVN and then rolling back to SDN. After the rollback, ovn-northd coredump files were found on some of the nodes.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-11-23-195308
How reproducible:


Steps to Reproduce:
1. Migrate SDN to OVN
2. Rollback OVN to SDN
3. Check each node for coredump files

Actual results:
for f in $(oc get nodes  -o jsonpath='{.items[*].metadata.name}') ; do oc debug node/"${f}" --  chroot /host coredumpctl list; done
Starting pod/ip-10-0-129-82us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
No coredumps found.

Removing debug pod ...
error: non-zero exit code from debug container
Starting pod/ip-10-0-136-69us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
TIME                            PID   UID   GID SIG COREFILE  EXE
Tue 2020-11-24 02:33:20 UTC    2845     0     0  11 present   /usr/bin/ovn-northd

Removing debug pod ...
Starting pod/ip-10-0-177-88us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
No coredumps found.

Removing debug pod ...
Starting pod/ip-10-0-185-53us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
TIME                            PID   UID   GID SIG COREFILE  EXE
Tue 2020-11-24 02:33:20 UTC    3118     0     0  11 present   /usr/bin/ovn-northd

Removing debug pod ...
Starting pod/ip-10-0-200-132us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
TIME                            PID   UID   GID SIG COREFILE  EXE
Tue 2020-11-24 02:33:19 UTC    2916     0     0  11 present   /usr/bin/ovn-northd

Removing debug pod ...
Starting pod/ip-10-0-217-222us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
No coredumps found.

Removing debug pod ...
error: non-zero exit code from debug container
huiran-mac:script hrwang$ for f in $(oc get nodes  -o jsonpath='{.items[*].metadata.name}') ; do oc debug node/"${f}" --  chroot /host ls -lrt  /var/lib/systemd/coredump/; done
Starting pod/ip-10-0-129-82us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 0

Removing debug pod ...
Starting pod/ip-10-0-136-69us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 4688
-rw-r-----. 1 root root 4796288 Nov 24 02:33 core.ovn-northd.0.c2009f9ffdc446faa9fa612b95fcc157.2845.1606185198000000.lz4

Removing debug pod ...
Starting pod/ip-10-0-177-88us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 0

Removing debug pod ...
Starting pod/ip-10-0-185-53us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 4496
-rw-r-----. 1 root root 4598629 Nov 24 02:33 core.ovn-northd.0.a2bce18de9434f7f9dc5dd065d2fc972.3118.1606185198000000.lz4

Removing debug pod ...
Starting pod/ip-10-0-200-132us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 4740
-rw-r-----. 1 root root 4849631 Nov 24 02:33 core.ovn-northd.0.1958581f4eb046a68e0144a5a13ba9e2.2916.1606185198000000.lz4

Removing debug pod ...
Starting pod/ip-10-0-217-222us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
total 0

Removing debug pod ...
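
The per-node listings above can be condensed into one count per node. The following is a minimal sketch, not part of the original report: `count_cores_for` and `report_nodes` are hypothetical helper names, and the sketch assumes that `oc` is logged in to the cluster and that the EXE column is the last field of each `coredumpctl list` row, as in the output above. `coredumpctl list` exits non-zero when it finds no entries, which is likely the source of the "non-zero exit code from debug container" lines.

```shell
#!/usr/bin/env bash
# Count coredumpctl rows whose EXE column (last field) equals a binary path.
# Reads `coredumpctl list` output on stdin. The header row and the
# "No coredumps found." line never match, because their last field
# is not the binary path.
count_cores_for() {
    local exe="$1"
    awk -v exe="$exe" '$NF == exe { n++ } END { print n + 0 }'
}

# Hypothetical wrapper: print "<node> <count>" for every node in the cluster.
report_nodes() {
    local node
    for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do
        # coredumpctl exits non-zero when it finds nothing, so its status
        # is deliberately ignored; only the parsed count is reported.
        printf '%s %s\n' "$node" \
            "$(oc debug node/"$node" -- chroot /host coredumpctl list 2>/dev/null \
                | count_cores_for /usr/bin/ovn-northd)"
    done
}
```

Calling `report_nodes` after sourcing the file would print one line per node, which makes it easier to compare a rolled-back cluster against a fresh SDN or OVN install.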

Expected results:
No coredump files 

Additional info:
Found this issue while running regression tests after rolling back from OVN to SDN. Logs of the failed case: http://ci-qe-openshift.usersys.redhat.com/userContent/cucushift/v3/2020/11/23/12:44:00/Should_not_break_the_cluster_when_creating_network_policy_with_incorrect_json_structure/console.html
I then reproduced it manually after rolling back from OVN to SDN.

Checked on a fresh SDN cluster and a fresh OVN cluster and did not find this issue.
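
The core file names listed under /var/lib/systemd/coredump/ carry enough information to correlate the crashes across nodes. The sketch below is illustrative only: `describe_core` is a hypothetical helper, and it assumes the systemd-coredump naming layout `core.<comm>.<uid>.<boot-id>.<pid>.<timestamp in microseconds>.<ext>`, a process name without dots, bash, and GNU `date`.

```shell
#!/usr/bin/env bash
# Split a systemd-coredump file name into its fields.
# Assumed layout: core.<comm>.<uid>.<boot-id>.<pid>.<timestamp-usec>.<ext>
# This breaks if <comm> itself contains a dot.
describe_core() {
    local name prefix comm uid boot pid usec ext
    name="$(basename "$1")"
    IFS=. read -r prefix comm uid boot pid usec ext <<<"$name"
    # The timestamp is in microseconds since the epoch; convert to seconds.
    printf 'comm=%s uid=%s pid=%s boot=%s time=%s\n' \
        "$comm" "$uid" "$pid" "$boot" \
        "$(date -u -d "@$((usec / 1000000))" '+%Y-%m-%d %H:%M:%S UTC')"
}

describe_core core.ovn-northd.0.c2009f9ffdc446faa9fa612b95fcc157.2845.1606185198000000.lz4
```

For the three files in this report the timestamp field is identical (1606185198000000, which is 2020-11-24 02:33:18 UTC), consistent with the TIME column reported by coredumpctl and suggesting that ovn-northd crashed on all three nodes at about the same moment.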

Comment 1 huirwang 2020-11-24 05:54:43 UTC
Created attachment 1732831
coredump file

Comment 2 Alexander Constantinescu 2020-11-30 17:03:01 UTC
Assigning the bug to the OVN team so that they can look at the ovn-northd coredump and figure out whether it is expected given the rollback from OVN to SDN.

@OVN team: 

OCP 4.7.0-0.nightly-2020-11-23-195308 runs OVN ovn2.13-20.09.0-7.el8fdn.x86_64, but I am not sure which version that corresponds to in your versioning scheme, so excuse me if it's incorrect.
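
For anyone picking up the coredump, one possible way to get a backtrace is sketched below. This is an assumption-laden outline rather than a verified procedure: `backtrace_commands` is a hypothetical helper, the default output path is made up for illustration, and it presumes that `gdb` and matching debuginfo for ovn2.13 are available wherever the commands are run. The function only prints the commands so that they can be reviewed before being executed on the affected node.

```shell
#!/usr/bin/env bash
# Print the commands for extracting a backtrace from a core recorded by
# systemd-coredump. Nothing is executed here; review the output, then run
# it on the affected node (for example after `chroot /host`).
backtrace_commands() {
    local pid="$1" exe="$2" out="${3:-/tmp/core.$1}"
    # Show the metadata systemd-coredump stored for this PID.
    printf 'coredumpctl info %s\n' "$pid"
    # Write the decompressed core to a file.
    printf 'coredumpctl dump %s --output=%s\n' "$pid" "$out"
    # Dump the stacks of all threads non-interactively.
    printf 'gdb %s %s -batch -ex "thread apply all bt"\n' "$exe" "$out"
}

backtrace_commands 2845 /usr/bin/ovn-northd
```

PID 2845 and /usr/bin/ovn-northd are taken from the coredumpctl output for node ip-10-0-136-69 in the description.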

Comment 3 Mark Michelson 2021-10-29 13:58:27 UTC

*** This bug has been marked as a duplicate of bug 1957030 ***

