Bug 1982960

Summary: [4.8 Internal mode]- must-gather ceph folder no longer has cephcluster,cephobjectstore,cephobjectstoreuser collections
Product: [Red Hat Storage] Red Hat OpenShift Container Storage
Reporter: Neha Berry <nberry>
Component: must-gather
Assignee: Mudit Agarwal <muagarwa>
Status: VERIFIED ---
QA Contact: Neha Berry <nberry>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.8
CC: muagarwa
Target Milestone: ---
Keywords: AutomationBackLog, Regression
Target Release: OCS 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.8.0-456.ci
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
gather-debug.log none

Description Neha Berry 2021-07-16 06:21:45 UTC
Created attachment 1802249
gather-debug.log

Description of problem (please be as detailed as possible and provide log
snippets):
=======================================================================
Checked multiple regression runs of OCS 4.8.0-444.ci and above, and it seems that must-gather is no longer collecting cephclusters, cephobjectstores and cephobjectstoreusers in the ceph/namespaces/openshift-storage/ceph.rook.io/ folder.

Current collection:
============
ls -l must-gather.local.892446471755977210/quay-io-rhceph-dev-ocs-must-gather-sha256-34a308176a13725fdf66feed07da46bb952f4921411f59dba0e0de4548ee1180/ceph/namespaces/openshift-storage/ceph.rook.io/
total 0
drwxr-xr-x. 1 nberry nberry 74 Jul 16 11:00 cephblockpools
drwxr-xr-x. 1 nberry nberry 76 Jul 16 11:00 cephfilesystems

Expected collection
=====================
➜  ceph.rook.io ls -l
total 0
drwxr-xr-x. 1 nberry nberry  74 Jun 25 23:45 cephblockpools
drwxr-xr-x. 1 nberry nberry  70 Jun 25 23:45 cephclusters
drwxr-xr-x. 1 nberry nberry  76 Jun 25 23:45 cephfilesystems
drwxr-xr-x. 1 nberry nberry  78 Jun 25 23:45 cephobjectstores
drwxr-xr-x. 1 nberry nberry 152 Jun 25 23:45 cephobjectstoreusers
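
A quick check for the missing directories could be scripted along these lines. This is only a sketch: the function name missing_collections is hypothetical, and the list of expected directory names is taken from the "Expected collection" listing above.

```shell
# Hypothetical helper: print every expected ceph.rook.io directory that is
# absent from the folder given as $1. Returns non-zero if any are missing.
missing_collections() {
    dir="$1"
    rc=0
    for name in cephblockpools cephclusters cephfilesystems \
                cephobjectstores cephobjectstoreusers; do
        if [ ! -d "$dir/$name" ]; then
            echo "$name"
            rc=1
        fi
    done
    return "$rc"
}
```

It would be called with the path to the ceph.rook.io folder inside the must-gather output, e.g. missing_collections <must-gather-dir>/ceph/namespaces/openshift-storage/ceph.rook.io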


This seems to be a regression in recent builds of OCS 4.8 and is 100% reproducible.

Is it possible that this regression was introduced by the recent fixes for Bug 1963207 or Bug 1978663?

Logs from one tier1 run (v4.8.0-452) - http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j055ai3c333-t1/j055ai3c333-t1_20210714T152410/logs/failed_testcase_ocs_logs_1626284165/test_must_gather%5bOTHERS%5d_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-416b9b01a25fed45028819a90970af0bdfdd07f3860ab8552c05ab5cea59065f/ceph/namespaces/openshift-storage/ceph.rook.io/

Same issue in v4.8.0-444.ci - http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j001ai3c36-ua/j001ai3c36-ua_20210706T142418/logs/failed_testcase_ocs_logs_1625592415/test_must_gather%5bOTHERS%5d_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-97d8e8c06ef5588f61412a0b54a4ae1bb022da660bac9169cb9184da58d61c06/ceph/namespaces/openshift-storage/ceph.rook.io/

Version of all relevant components (if applicable):
====================================================
Seen in many recent OCS 4.8 builds, e.g. 4.8.0-452.ci


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
===========================================================
No, but important logs are missing.

Is there any workaround available to the best of your knowledge?
============================================================
Not sure


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
=================================================================
3

Is this issue reproducible?
==============================
Yes

Can this issue be reproduced from the UI?
========================================
NA

If this is a regression, please provide more details to justify this:
===========================================================
Yes. 

Steps to Reproduce:
====================
1. Install the latest OCS 4.8 build, e.g. 4.8.0-452.ci
2. Run must-gather

oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.8|tee terminal-mg-452-encr_ci.log
3. Terminal logs say col


Actual results:
===================
Terminal output
--------------
[must-gather-j92rb] POD 2021-07-16T05:30:07.030023814Z Collecting entire ceph logs
[must-gather-j92rb] POD 2021-07-16T05:30:07.037111829Z collecting dump cephobjectstores
[must-gather-j92rb] POD 2021-07-16T05:30:07.210345070Z collecting dump cephobjectstoreusers
[must-gather-j92rb] POD 2021-07-16T05:30:07.403808319Z collecting dump cephclusters
[must-gather-j92rb] POD 2021-07-16T05:30:07.596838834Z collecting dump cephblockpools
[must-gather-j92rb] POD 2021-07-16T05:30:07.756286463Z collecting dump cephfilesystems


gather-debug log
===================
Collecting entire ceph logs
collecting dump cephblockpools
Wrote inspect data to must-gather/ceph.
collecting dump cephfilesystems
Wrote inspect data to must-gather/ceph.

The "collecting dump" entries for cephclusters, cephobjectstores and cephobjectstoreusers appear in the terminal output but are absent from the gather-debug log.
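
The comparison between the two logs can be automated roughly as follows. This is a sketch under stated assumptions: the function names dump_names and missing_from_debug_log are hypothetical, and it assumes both logs use the "collecting dump <resource>" line format shown above.

```shell
# Hypothetical helper: list the resource names announced by
# "collecting dump <name>" lines in the log file given as $1.
dump_names() {
    sed -n 's/.*collecting dump \([a-z]*\).*/\1/p' "$1" | sort -u
}

# Hypothetical helper: print names that the terminal log ($1) announces
# but that never show up in gather-debug.log ($2).
missing_from_debug_log() {
    dump_names "$1" | while read -r name; do
        dump_names "$2" | grep -qx "$name" || echo "$name"
    done
}
```

For the run in this report it would be invoked as missing_from_debug_log terminal-mg-452-encr_ci.log gather-debug.log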


Expected results:
=====================
The files should be collected



Additional info:
======================
BTW, the resources exist on the cluster:

oc get cephcluster,cephobjectstore,cephobjectstoreuser -n openshift-storage
NAME                                                      DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
cephcluster.ceph.rook.io/ocs-storagecluster-cephcluster   /var/lib/rook     3          22h   Ready   Cluster created successfully   HEALTH_OK   

NAME                                                              AGE
cephobjectstore.ceph.rook.io/ocs-storagecluster-cephobjectstore   22h

NAME                                                                      AGE
cephobjectstoreuser.ceph.rook.io/noobaa-ceph-objectstore-user             22h
cephobjectstoreuser.ceph.rook.io/ocs-storagecluster-cephobjectstoreuser   22h

Comment 3 Mudit Agarwal 2021-07-16 14:25:02 UTC
Yeah, this is a regression because of the fix for bug #1963207 (https://github.com/openshift/ocs-operator/pull/1220)

But we have a simple workaround, and I don't think this is a blocker for 4.8.

The workaround is to run these three commands manually:

1.  oc adm inspect cephclusters --all-namespaces
2.  oc adm inspect cephobjectstoreusers --all-namespaces
3.  oc adm inspect cephobjectstores --all-namespaces
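
The three commands can also be wrapped in a small loop. This is only a sketch: the function name inspect_missing_resources and the overridable OC variable are assumptions added for illustration (setting OC=echo previews the commands without touching a cluster).

```shell
# Assumption: OC names the client binary; override with OC=echo for a dry run.
OC="${OC:-oc}"

# Hypothetical wrapper around the three workaround commands; stops at the
# first failing inspect call.
inspect_missing_resources() {
    for resource in cephclusters cephobjectstoreusers cephobjectstores; do
        "$OC" adm inspect "$resource" --all-namespaces || return 1
    done
}
```

Running inspect_missing_resources on a machine logged in to the cluster issues the same three oc adm inspect calls in order.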