Bug 1795177

Summary:

OC status command displays - panic: runtime error: invalid memory address or nil pointer dereference

Product:

OpenShift Container Platform

Reporter:

Lakshmi Ravichandran <lakshmi.ravichandran1>

Component:

Multi-Arch

Assignee:

Carvel Baus <cbaus>

Status:

CLOSED CURRENTRELEASE

QA Contact:

Barry Donahue <bdonahue>

Severity:

low

Docs Contact:

Priority:

low

Version:

4.2.z

CC:

cbaus, dorzel, Holger.Wolf, hwolf, nbziouec

Target Milestone:

---

Target Release:

---

Hardware:

s390x

OS:

Linux

Last Closed:

2020-06-29 17:46:05 UTC

Type:

Bug

Attachments:

Description	Flags
SIGSEGV: segmentation error	none

Description Lakshmi Ravichandran 2020-01-27 11:23:25 UTC

Created attachment 1655647
SIGSEGV: segmentation error

Description of problem:
The "oc status" command of the OC CLI displays:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 ...]

Version-Release number of selected component (if applicable):
Client Version: openshift-clients-4.2.2-201910250432-12-g72076900
Server Version: 4.2.12-s390x
Kubernetes Version: v1.14.6+32dc4a0

How reproducible:
Introduce 100% disk stress on one of the worker nodes in the OCP cluster using the filebench command.
The OCP console stops responding and the worker node goes down.
Running "oc status" from the OC CLI then displays
"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 ...]
"
rather than a proper error message.


Steps to Reproduce:
1. Introduce a 100% disk utilization workload on one of the worker nodes.
2. Observe that the worker node goes to the "Not ready" state and the OCP console stops responding.
3. Log in to the bastion and run the oc status command; it fails with a SIGSEGV error.
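
When re-testing step 3, it can help to tell a crash apart from an ordinary error automatically. The following is a minimal sketch in Go, assuming the output of "oc status" has already been captured as a string; the helper name looksLikeGoPanic is hypothetical and not part of oc.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeGoPanic reports whether captured command output contains the
// header that the Go runtime prints for a runtime panic.
// Hypothetical helper for illustration only; it is not part of oc.
func looksLikeGoPanic(output string) bool {
	return strings.Contains(output, "panic: runtime error:")
}

func main() {
	// Sample text taken from this report. In practice the string would be
	// the captured output of "oc status".
	sample := "panic: runtime error: invalid memory address or nil pointer dereference"
	fmt.Println(looksLikeGoPanic(sample))
}
```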

Actual results:
Segmentation error given by the oc status command


Expected results:
The oc status command should display a proper error message for this scenario.
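
As an illustration of the expected behaviour, the sketch below shows the general Go pattern of checking a pointer for nil and returning an error instead of dereferencing it. The actual cause inside oc is not identified in this report, so the type nodeStatus and the function describeNode are hypothetical stand-ins and do not reflect the oc source code.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeStatus is a hypothetical stand-in for data fetched from the cluster.
type nodeStatus struct {
	Name  string
	Ready bool
}

// describeNode returns a one-line summary of the node. When the status is
// missing it returns an error instead of dereferencing a nil pointer,
// which is what would otherwise trigger the panic.
func describeNode(s *nodeStatus) (string, error) {
	if s == nil {
		return "", errors.New("node status unavailable: the node may be unreachable")
	}
	state := "NotReady"
	if s.Ready {
		state = "Ready"
	}
	return fmt.Sprintf("%s: %s", s.Name, state), nil
}

func main() {
	// A nil status models a lookup that returned nothing.
	if _, err := describeNode(nil); err != nil {
		fmt.Println("error:", err)
	}
	line, _ := describeNode(&nodeStatus{Name: "worker-1", Ready: false})
	fmt.Println(line)
}
```

With a guard of this kind the caller receives an error value it can print, rather than the process terminating with SIGSEGV.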

Additional info:
Only the worker node that was put under stress went to the "Not ready" state; the other master and worker nodes in the cluster were in the "Ready" state.

Comment 1 Carvel Baus 2020-02-04 21:08:42 UTC

Can you provide more specific information about "Introduce 100% disk utilization workload on one of the worker nodes"?

filebench does not appear to be included as part of RHCOS. Could you also please provide the exact command used to run filebench, including arguments?

Comment 2 Lakshmi Ravichandran 2020-02-13 12:32:10 UTC

During the bugzappers call, it was decided that this bug will be followed up together with https://bugzilla.redhat.com/show_bug.cgi?id=1795185

Comment 3 Carvel Baus 2020-06-10 19:14:13 UTC

A possible fix for this landed in the latest 4.2 nightly. Can you re-test this and see if it can be reproduced?

Comment 4 Lakshmi Ravichandran 2020-06-23 18:10:51 UTC

The scenario was tested on OCP version
Client Version: 4.4.0-0.nightly-s390x-2020-06-17-185805
Server Version: 4.4.0-0.nightly-s390x-2020-06-17-185805
Kubernetes Version: v1.17.1+912792b

The reported behaviour was not reproducible, so the fix appears to have landed.

Can someone please help me understand whether the scenario still has to be tested on the latest 4.2 nightly as well?
I assume not, since the versions of CRI-O and the other components carrying the OOM fixes are already more advanced in 4.4.0-0.nightly-s390x-2020-06-17-185805. Please correct me if I am wrong.

Comment 5 Lakshmi Ravichandran 2020-06-24 16:18:38 UTC

Tested the bug scenario on OCP 4.2.34 and the reported behaviour is not observed.

oc version
-----------
Client Version: 4.4.0-0.nightly-s390x-2020-06-12-154108
Server Version: 4.2.34
Kubernetes Version: v1.14.6+20b13ba