Bug 1877428
| Summary: | Update Must Gather to pull raw core dumps off nodes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Andrew Stoycos <astoycos> |
| Component: | Networking | Assignee: | Andrew Stoycos <astoycos> |
| Networking sub component: | ovn-kubernetes | QA Contact: | Ross Brattain <rbrattai> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | anusaxen, jtanenba |
| Version: | 4.6 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.6.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-09-22 17:16:50 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1887446 | | |
|
Description
Andrew Stoycos
2020-09-09 15:28:17 UTC
@Ross, can you help verify this? Thanks

Forgot to mention that until 1728135 is fixed, the oc debug image might not always be available on disconnected clusters. See https://bugzilla.redhat.com/show_bug.cgi?id=1728135

For QE automation we work around this by using oc debug --image with the multus image, since that image always exists on every node:

oc debug node/"${NODE}" --image=$(oc get pod -n openshift-multus -l app=multus -o jsonpath='{.items[0].spec.containers[?(@.name=="kube-multus")].image}')

@Ross, can you help verify it?

Maybe this: https://github.com/openshift/oc/blob/bfd07f8816d45f76181412fe67c919dcfba5d55f/pkg/cli/debug/debug.go#L380

generateName := names.SimpleNameGenerator.GenerateName("openshift-debug-node-")

So it looks like we create a temp namespace in some cases, and the sed is matching the `namespace/.*` instead of `pod/.*`.

Ah ok, that's definitely unexpected behavior... I wish we just labeled this pod somehow; it would make getting the pod name much less hackish. In those instances where we make a tmp namespace, do we also still make a pod? I could ensure we match on `pod/.*` rather than just `/`, but it still might run into issues.

Verified on 4.6.0-0.nightly-2020-09-22-073212

[must-gather-5nqlx] POD WARNING: Collecting network logs on ALL linux nodes in your cluster. This could take a long time.
[must-gather-5nqlx] POD INFO: Waiting for node core dump collection to complete ...
[must-gather-5nqlx] POD pod/qe46g23-s8c8l-master-1copenshift-qeinternal-debug condition met
[must-gather-5nqlx] POD pod/qe46g23-s8c8l-worker-c-8h6mgcopenshift-qeinternal-debug condition met
[must-gather-5nqlx] POD Copying core dumps on node qe46g23-s8c8l-worker-c-8h6mg.c.openshift-qe.internal
[must-gather-5nqlx] POD Copying core dumps on node qe46g23-s8c8l-master-1.c.openshift-qe.internal
[must-gather-5nqlx] POD pod/qe46g23-s8c8l-worker-a-ssmq9copenshift-qeinternal-debug condition met
[must-gather-5nqlx] POD Copying core dumps on node qe46g23-s8c8l-worker-a-ssmq9.c.openshift-qe.internal
[must-gather-5nqlx] POD pod/qe46g23-s8c8l-master-2copenshift-qeinternal-debug condition met
[must-gather-5nqlx] POD Copying core dumps on node qe46g23-s8c8l-master-2.c.openshift-qe.internal
[must-gather-5nqlx] POD pod/qe46g23-s8c8l-worker-b-mwtsbcopenshift-qeinternal-debug condition met
[must-gather-5nqlx] POD Copying core dumps on node qe46g23-s8c8l-worker-b-mwtsb.c.openshift-qe.internal
[must-gather-5nqlx] POD pod/qe46g23-s8c8l-master-0copenshift-qeinternal-debug condition met
[must-gather-5nqlx] POD Copying core dumps on node qe46g23-s8c8l-master-0.c.openshift-qe.internal
[must-gather-5nqlx] POD pod "qe46g23-s8c8l-worker-c-8h6mgcopenshift-qeinternal-debug" deleted
[must-gather-5nqlx] POD pod "qe46g23-s8c8l-worker-a-ssmq9copenshift-qeinternal-debug" deleted
[must-gather-5nqlx] POD pod "qe46g23-s8c8l-worker-b-mwtsbcopenshift-qeinternal-debug" deleted
[must-gather-5nqlx] POD pod "qe46g23-s8c8l-master-1copenshift-qeinternal-debug" deleted
[must-gather-5nqlx] POD pod "qe46g23-s8c8l-master-2copenshift-qeinternal-debug" deleted
[must-gather-5nqlx] POD pod "qe46g23-s8c8l-master-0copenshift-qeinternal-debug" deleted
[must-gather-5nqlx] POD INFO: Node core dump collection to complete.
[must-gather-5nqlx] OUT waiting for gather to complete
[must-gather-5nqlx] OUT downloading gather output
[must-gather-5nqlx] OUT receiving incremental file list
[must-gather-5nqlx] OUT ./
[must-gather-5nqlx] OUT node_core_dumps/
[must-gather-5nqlx] OUT node_core_dumps/qe46g23-s8c8l-master-0.c.openshift-qe.internal_core_dump/
[must-gather-5nqlx] OUT node_core_dumps/qe46g23-s8c8l-master-0.c.openshift-qe.internal_core_dump/core.1234
[must-gather-5nqlx] OUT node_core_dumps/qe46g23-s8c8l-master-1.c.openshift-qe.internal_core_dump/
|
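
The `namespace/.*` versus `pod/.*` mismatch discussed in the comments above can be sketched as a small shell filter. This is only an illustration, not the actual must-gather change: the helper name `extract_debug_pod` and the sample input lines (assumed to be shaped like `<kind>/<name> created`) are made up for the example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read resource lines on stdin and print the name of
# the first pod. Lines for other kinds (for example a temporary namespace)
# are ignored because the pattern is anchored on "pod/".
extract_debug_pod() {
    sed -n 's|^pod/\([^[:space:]]*\).*|\1|p' | head -n 1
}

# Assumed sample input: a temp namespace line followed by the pod line.
sample_output='namespace/openshift-debug-node-abcde created
pod/worker-0-debug created'

printf '%s\n' "$sample_output" | extract_debug_pod
```

Anchoring on `^pod/` skips a namespace line regardless of the order in which the lines arrive, whereas a pattern that only looks for `/` would pick up whichever resource is printed first. Labeling the debug pod, as wished for in the thread, would avoid parsing text output altogether.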