1801070 – must-gather(latest-4.2) throws warning messages and doesn't collect all relevant info & logs



Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat OpenShift Container Storage
Classification:	Red Hat Storage
Component:	must-gather
Sub Component:
Version:	4.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Ashish Ranjan
QA Contact:	Raz Tamir
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-02-10 07:58 UTC by Neha Berry
Modified:	2020-02-18 09:57 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-02-18 09:57:42 UTC
Embargoed:
Dependent Products:


Description Neha Berry 2020-02-10 07:58:55 UTC

Description of problem (please be as detailed as possible and provide log
snippets):
---------------------------------------------------------------------------
The OCS must-gather image with the latest-4.2 tag prints warning and error messages during log collection:

>> $ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.2

Using image: quay.io/rhceph-dev/ocs-must-gather:latest-4.2
namespace/openshift-must-gather-k4db7 created
clusterrolebinding.rbac.authorization.k8s.io/must-gather-hk8qg created
Collecting operator pod logs
Collection dump of storageclusters
WARNING: openshift-must-gather has been DEPRECATED. Use `oc adm inspect` instead.
Error: unknown flag: --base-dir
See 'oc adm inspect --help' for usage.
Collection dump of objectbucketclaims
WARNING: openshift-must-gather has been DEPRECATED. Use `oc adm inspect` instead.
Error: unknown flag: --base-dir
See 'oc adm inspect --help' for usage.


>>Issues:

1. Every collection step prints the deprecation warning and the "unknown flag: --base-dir" error shown above.

2. Some commands appear to have changed in OCP 4.3, so the must-gather commands should be aligned with the OCP version.

3. Even though must-gather completed without failing, most of the relevant logs and command outputs were not collected. Full output pasted in Additional Info
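
To quantify issue 1, the captured console output can be scanned for warning and error lines. The following is only a minimal sketch: the function name summarize_must_gather_output and the variable sample are hypothetical, and the sample text is copied from the output pasted above. It assumes the output has been saved as plain text and that the relevant lines start with "WARNING:" or "Error:" as they do in this run.

```python
from collections import Counter


def summarize_must_gather_output(output: str) -> Counter:
    """Count lines starting with 'WARNING:' or 'Error:' in captured output.

    Hypothetical helper for illustration; it is not part of must-gather.
    """
    counts = Counter()
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("WARNING:"):
            counts["warning"] += 1
        elif stripped.startswith("Error:"):
            counts["error"] += 1
    return counts


# Excerpt of the console output reported in this bug.
sample = """\
Collecting operator pod logs
Collection dump of storageclusters
WARNING: openshift-must-gather has been DEPRECATED. Use `oc adm inspect` instead.
Error: unknown flag: --base-dir
See 'oc adm inspect --help' for usage.
Collection dump of objectbucketclaims
WARNING: openshift-must-gather has been DEPRECATED. Use `oc adm inspect` instead.
Error: unknown flag: --base-dir
See 'oc adm inspect --help' for usage.
"""

if __name__ == "__main__":
    print(summarize_must_gather_output(sample))
```

A non-zero "error" count on a run that otherwise exits cleanly is a hint that the collected data may be incomplete, which matches issue 3.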

Setup:

1. Platform: VMware VMFS
2. OS: RHCOS
3. CPU=4, Memory=16GB (this was not a production cluster)
4. OCS version : 4.2.1

Version of all relevant components (if applicable):
---------------------------------------------------------------------------
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0     True        False         73m     Cluster version is 4.3.0


Must-gather image : "quay.io/rhceph-dev/ocs-must-gather:latest-4.2"

ocs-operator.v4.2.1   OpenShift Container Storage   4.2.1 


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
---------------------------------------------------------------------------
Unable to collect logs, which does impact reporting new issues.

Is there any workaround available to the best of your knowledge?
---------------------------------------------------------------------------
Not sure

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
---------------------------------------------------------------------------
3

Is this issue reproducible?
---------------------------------------------------------------------------
Yes

Can this issue reproduce from the UI?
---------------------------------------------------------------------------
No

If this is a regression, please provide more details to justify this:
---------------------------------------------------------------------------
Yes

Steps to Reproduce:
---------------------------------------------------------------------------
1. Create an OCS cluster in VMware ( jenkins run - https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/4419/console) 
2. Once setup is created, collect OCS must-gather.
 $ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.2


Actual results:
---------------------------------------------------------------------------
The must-gather command throws multiple warnings, and not all logs are collected:

$ ls -ltrh  must-gather.local.3305830168128168859/ceph/namespaces/openshift-storage/
total 12K
drwxr-xr-x. 5 nberry nberry 4.0K Feb 10 12:44 osd_prepare_volume_logs
drwxr-xr-x. 6 nberry nberry 4.0K Feb 10 12:44 pods
drwxr-xr-x. 3 nberry nberry 4.0K Feb 10 12:44 must_gather_commands
[nberry@localhost ocs-logs]$ 



Expected results:
---------------------------------------------------------------------------

Previously, must-gather collected the following:

#/home/nberry/aws-install/dec9-1/logs/must-gather.local.1804442920985571948/ceph/namespaces/openshift-storage
drwxr-xr-x.  2 nberry nberry 4096 Dec 10 13:01 apps
drwxr-xr-x.  2 nberry nberry 4096 Dec 10 13:01 apps.openshift.io
drwxr-xr-x.  2 nberry nberry 4096 Dec 10 13:01 autoscaling
drwxr-xr-x.  2 nberry nberry 4096 Dec 10 13:01 batch
drwxr-xr-x.  2 nberry nberry 4096 Dec 10 13:01 build.openshift.io
drwxr-xr-x.  5 nberry nberry 4096 Dec 10 13:01 ceph.rook.io
drwxr-xr-x.  2 nberry nberry 4096 Dec 10 13:01 core
drwxr-xr-x.  2 nberry nberry 4096 Dec 10 13:01 image.openshift.io
drwxr-xr-x.  3 nberry nberry 4096 Dec 10 13:01 must_gather_commands
-rwxr-xr-x.  1 nberry nberry  511 Dec 10 12:59 openshift-storage.yaml
drwxr-xr-x.  3 nberry nberry 4096 Dec 10 13:01 operators.coreos.com
drwxr-xr-x.  5 nberry nberry 4096 Dec 10 13:01 osd_prepare_volume_logs
drwxr-xr-x. 38 nberry nberry 4096 Dec 10 13:01 pods
drwxr-xr-x.  2 nberry nberry 4096 Dec 10 13:02 route.openshift.io
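
The gap between the two listings can be made explicit with a simple set comparison. This is a rough sketch, not a supported tool: the names expected, actual, and missing_entries are hypothetical, and both lists are transcribed by hand from the listings in this report (the December run as the baseline, the February run as the actual result).

```python
# Entries under ceph/namespaces/openshift-storage/ in the earlier (Dec 10) run.
expected = [
    "apps",
    "apps.openshift.io",
    "autoscaling",
    "batch",
    "build.openshift.io",
    "ceph.rook.io",
    "core",
    "image.openshift.io",
    "must_gather_commands",
    "openshift-storage.yaml",
    "operators.coreos.com",
    "osd_prepare_volume_logs",
    "pods",
    "route.openshift.io",
]

# Entries present in the run reported in this bug (Feb 10).
actual = ["osd_prepare_volume_logs", "pods", "must_gather_commands"]


def missing_entries(expected_names, actual_names):
    """Return expected names absent from actual_names, in expected order."""
    present = set(actual_names)
    return [name for name in expected_names if name not in present]


if __name__ == "__main__":
    for name in missing_entries(expected, actual):
        print(name)
```

In a real check, the two lists would more likely come from os.listdir() on the respective must-gather output directories rather than being typed in.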

Comment 3 Neha Berry 2020-02-10 08:04:08 UTC

Logs: http://rhsqe-repo.lab.eng.blr.redhat.com/cns/ocs-qe-bugs/bz-1801070.zip

Comment 5 Ashish Ranjan 2020-02-14 08:21:40 UTC

I had a chat with Boris regarding this and found out that the downstream build was broken for some time due to an incorrect Dockerfile update. Can you retry with the latest 4.2 image?
