Bug 1889441

Summary: Traceback error message while running OCS 4.6 must-gather
Product: [Red Hat Storage] Red Hat OpenShift Container Storage Reporter: Neha Berry <nberry>
Component: must-gather Assignee: Mudit Agarwal <muagarwa>
Status: CLOSED ERRATA QA Contact: Persona non grata <nobody+410372>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.6 CC: ebenahar, muagarwa, ocs-bugs, pkundra, sabose
Target Milestone: ---   
Target Release: OCS 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.6.0-142.ci Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-12-17 06:24:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
must-gather terminal output none
must gather cmd output none

Description Neha Berry 2020-10-19 16:43:08 UTC
Created attachment 1722659 [details]
must-gather terminal output

Description of problem (please be as detailed as possible and provide log
snippets):
---------------------------------------------------------------
On internal-mode clusters, a traceback error message (shown below) appears while ceph command outputs are being collected.

$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6 

[must-gather-ddw2b] POD collecting snapshot info for ceph rbd volumes 
[must-gather-ddw2b] POD collecting snapshot info for ceph subvolumes 
[must-gather-ddw2b] POD bash: jq: command not found
[must-gather-ddw2b] POD Traceback (most recent call last):
[must-gather-ddw2b] POD   File "/usr/bin/ceph", line 1269, in <module>
[must-gather-ddw2b] POD     retval = main()
[must-gather-ddw2b] POD   File "/usr/bin/ceph", line 1254, in main
[must-gather-ddw2b] POD     sys.stdout.flush()
[must-gather-ddw2b] POD BrokenPipeError: [Errno 32] Broken pipe
[must-gather-ddw2b] POD Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
[must-gather-ddw2b] POD BrokenPipeError: [Errno 32] Broken pipe
[must-gather-ddw2b] POD command terminated with exit code 127
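
The log above is consistent with a pipeline whose right-hand command is missing inside the must-gather pod: bash reports that jq is not found, the pipeline ends with exit code 127, and the writer (ceph) hits a broken pipe when it flushes stdout. The sketch below reproduces only that failure mode. It is not the must-gather script; `jq-not-installed` is a hypothetical command name standing in for the missing jq, and the JSON is made-up sample data.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a filter that is not installed where the script runs.
missing_filter="jq-not-installed"

# The producer writes JSON into a pipe whose reader never starts.
# Without 'set -o pipefail', the pipeline's status is that of the last
# command, and bash uses 127 for "command not found".
printf '{"snapshots": []}\n' | "$missing_filter" '.snapshots' 2>/dev/null
status=$?

echo "pipeline exit status: $status"
```

A long-running producer such as ceph may still be writing when the reader side of the pipe disappears, which is why it can report a broken pipe in addition to the shell's exit code.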


Version of all relevant components (if applicable):
-------------------------------------------------------
Tested on 2 separate clusters

ocs-operator.v4.6.0-137.ci and ocs-operator.v4.6.0-131.ci

OCP - 4.6.0-0.nightly-2020-10-14-095718

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
---------------------------------------------------------------------
no

Is there any workaround available to the best of your knowledge?
------------------------------------
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
--------------------------------
2

Is this issue reproducible?
---------------------------
Yes. Tested twice.

Can this issue reproduce from the UI?
-------------------------------------
NA

If this is a regression, please provide more details to justify this:
----------------------------------------------------------
Not sure


Steps to Reproduce:
--------------------------

1. Install OCS 4.6
2. Run OCS 4.6 must-gather on an internal-mode cluster
$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6

3. Check the log collection in the terminal and look for the above traceback while ceph command output is being collected


Actual results:
------------------
Traceback error seen

Expected results:
--------------------
No error message should appear


Additional info:
-----------------------

Not sure if it is indeed a bug. @pulkit, could you take a look and confirm?

Comment 2 Neha Berry 2020-10-19 16:45:18 UTC
Logs copied here - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bug-1889441/

@pkundra, can you confirm if it is a bug? I could not find the same error in the debug log.

Comment 3 Mudit Agarwal 2020-10-20 03:33:35 UTC
This might be due to the changes I made for collecting snapshot info. I will give a dev ack once I confirm that.

>> [must-gather-ddw2b] POD bash: jq: command not found

In the cluster I used for testing, "jq" command was working.

Neha, can I have the cluster where this is reproducible?

Comment 5 Neha Berry 2020-10-21 11:43:26 UTC
(In reply to Mudit Agarwal from comment #3)
> This might be due to the changes I made for collecting snapshot info. I will
> give a dev ack once I confirm the same.
> 
> >> [must-gather-ddw2b] POD bash: jq: command not found
> 
> In the cluster I used for testing, "jq" command was working.
> 
> Neha, can I have the cluster where this is reproducible?

jq is installed on my laptop too:

$ jq
jq - commandline JSON processor [version 1.6]



BTW, I tried on 2 separate internal-mode clusters and saw this message. But could it be due to some missing binary on my laptop?

I tried on a different cluster and still see this issue

[must-gather-zz76p] POD bash: jq: command not found
[must-gather-zz76p] POD Traceback (most recent call last):
[must-gather-zz76p] POD   File "/usr/bin/ceph", line 1269, in <module>
[must-gather-zz76p] POD     retval = main()
[must-gather-zz76p] POD   File "/usr/bin/ceph", line 1254, in main
[must-gather-zz76p] POD     sys.stdout.flush()
[must-gather-zz76p] POD BrokenPipeError: [Errno 32] Broken pipe
[must-gather-zz76p] POD Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
[must-gather-zz76p] POD BrokenPipeError: [Errno 32] Broken pipe
[must-gather-zz76p] POD command terminated with exit code 127

I will DM you the cluster details anyway.

Comment 6 Mudit Agarwal 2020-10-22 13:08:00 UTC
Thanks for the cluster Neha, I have a fix for this.
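
The fix itself is not described in this report. For the class of failure seen in the log (a pipeline feeding a filter that is missing where the script runs), one common guard is to check for the tool before building the pipeline. The sketch below assumes nothing about the actual must-gather change; `require_tool` is a hypothetical helper and the JSON is made-up sample data.

```shell
#!/usr/bin/env bash
# require_tool is a hypothetical helper, not taken from the must-gather scripts.
# It returns 0 when the named command is on PATH, and 1 (with a note on
# stderr) otherwise.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "skipping: '$1' not found in PATH" >&2
    return 1
}

# Only build the pipeline when the filter exists in this environment.
if require_tool jq; then
    printf '{"snapshots": ["snap-a"]}\n' | jq -r '.snapshots[]'
fi
```

The check has to run in the same environment as the pipeline: the "POD" prefix in the log shows the command ran inside the must-gather pod, so having jq on the machine that invokes `oc` does not help.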

Comment 10 Persona non grata 2020-10-27 10:29:27 UTC
Created attachment 1724519 [details]
must gather cmd output

Comment 11 Persona non grata 2020-10-27 10:45:18 UTC
Tried the must-gather command on ocs-operator.v4.6.0-144 with OCP 4.6.0-0.nightly-2020-10-27-011248.
I don't see any traceback errors in the must-gather logs. Moving the bug to verified.

Comment 14 errata-xmlrpc 2020-12-17 06:24:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5605