Created attachment 1802249
gather-debug.log

Description of problem (please be as detailed as possible and provide log snippets):
=======================================================================
In multiple regression runs of OCS 4.8.0-444.ci and above, must-gather is not collecting the cephclusters, cephobjectstores and cephobjectstoreusers directories in the ceph/namespaces/openshift-storage/ceph.rook.io/ folder.

Current collection:
============
ls -l must-gather.local.892446471755977210/quay-io-rhceph-dev-ocs-must-gather-sha256-34a308176a13725fdf66feed07da46bb952f4921411f59dba0e0de4548ee1180/ceph/namespaces/openshift-storage/ceph.rook.io/
total 0
drwxr-xr-x. 1 nberry nberry 74 Jul 16 11:00 cephblockpools
drwxr-xr-x. 1 nberry nberry 76 Jul 16 11:00 cephfilesystems

Expected collection
=====================
➜ ceph.rook.io ls -l
total 0
drwxr-xr-x. 1 nberry nberry  74 Jun 25 23:45 cephblockpools
drwxr-xr-x. 1 nberry nberry  70 Jun 25 23:45 cephclusters
drwxr-xr-x. 1 nberry nberry  76 Jun 25 23:45 cephfilesystems
drwxr-xr-x. 1 nberry nberry  78 Jun 25 23:45 cephobjectstores
drwxr-xr-x. 1 nberry nberry 152 Jun 25 23:45 cephobjectstoreusers

This seems to be a regression in recent builds of OCS 4.8 and is 100% reproducible. Is it possible that it was introduced by the recent fixes for Bug 1963207 or Bug 1978663?

Logs from 1 tier1 (v4.8.0-452) - http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j055ai3c333-t1/j055ai3c333-t1_20210714T152410/logs/failed_testcase_ocs_logs_1626284165/test_must_gather%5bOTHERS%5d_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-416b9b01a25fed45028819a90970af0bdfdd07f3860ab8552c05ab5cea59065f/ceph/namespaces/openshift-storage/ceph.rook.io/

Same issue in v4.8.0-444.ci - http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j001ai3c36-ua/j001ai3c36-ua_20210706T142418/logs/failed_testcase_ocs_logs_1625592415/test_must_gather%5bOTHERS%5d_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-97d8e8c06ef5588f61412a0b54a4ae1bb022da660bac9169cb9184da58d61c06/ceph/namespaces/openshift-storage/ceph.rook.io/

Version of all relevant components (if applicable):
====================================================
Seen in many recent OCS 4.8 builds, e.g. 4.8.0-452.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
===========================================================
No, but important logs are missing.

Is there any workaround available to the best of your knowledge?
============================================================
Not sure

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
=================================================================
3

Is this issue reproducible?
==============================
Yes

Can this issue reproduce from the UI?
========================================
NA

If this is a regression, please provide more details to justify this:
===========================================================
Yes.

Steps to Reproduce:
====================
1. Install the latest OCS 4.8, e.g. 4.8.0-452.ci
2. Run must-gather
   oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.8|tee terminal-mg-452-encr_ci.log
3. Terminal logs say col

Actual results:
===================
Terminal output
--------------
[must-gather-j92rb] POD 2021-07-16T05:30:07.030023814Z Collecting entire ceph logs
[must-gather-j92rb] POD 2021-07-16T05:30:07.037111829Z collecting dump cephobjectstores
[must-gather-j92rb] POD 2021-07-16T05:30:07.210345070Z collecting dump cephobjectstoreusers
[must-gather-j92rb] POD 2021-07-16T05:30:07.403808319Z collecting dump cephclusters
[must-gather-j92rb] POD 2021-07-16T05:30:07.596838834Z collecting dump cephblockpools
[must-gather-j92rb] POD 2021-07-16T05:30:07.756286463Z collecting dump cephfilesystems

gather-debug log
===================
Collecting entire ceph logs
collecting dump cephblockpools
Wrote inspect data to must-gather/ceph.
collecting dump cephfilesystems
Wrote inspect data to must-gather/ceph.

The "collecting dump" entries for cephclusters, cephobjectstores and cephobjectstoreusers are absent from the gather-debug log, even though the terminal output shows them.

Expected results:
=====================
The files should be collected.

Additional info:
======================
BTW, the resources exist:

oc get cephcluster,cephobjectstore,cephobjectstoreuser -n openshift-storage
NAME                                                      DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
cephcluster.ceph.rook.io/ocs-storagecluster-cephcluster   /var/lib/rook     3          22h   Ready   Cluster created successfully   HEALTH_OK

NAME                                                              AGE
cephobjectstore.ceph.rook.io/ocs-storagecluster-cephobjectstore   22h

NAME                                                                      AGE
cephobjectstoreuser.ceph.rook.io/noobaa-ceph-objectstore-user             22h
cephobjectstoreuser.ceph.rook.io/ocs-storagecluster-cephobjectstoreuser   22h

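
For anyone re-checking this on other builds, here is a rough sketch of how the comparison between the current and expected collection could be scripted. It only tests for the five directory names listed under "Expected collection"; the function name `missing_ceph_dumps` and its return-code convention are made up for illustration and are not part of must-gather.

```shell
# Print every expected resource directory that is absent from the given
# ceph.rook.io folder of a must-gather dump. The list of names is taken
# from the "Expected collection" listing in this report.
# missing_ceph_dumps is a hypothetical helper, not a must-gather command.
missing_ceph_dumps() {
    dir="$1"
    rc=0
    for kind in cephblockpools cephclusters cephfilesystems \
                cephobjectstores cephobjectstoreusers; do
        if [ ! -d "$dir/$kind" ]; then
            echo "$kind"
            rc=1   # remember that at least one directory is missing
        fi
    done
    return "$rc"
}
```

It would be called with the path to the ceph.rook.io folder inside the must-gather output, e.g. `missing_ceph_dumps <must-gather-dir>/ceph/namespaces/openshift-storage/ceph.rook.io`. It prints one missing name per line and returns non-zero if anything is missing.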
Yeah, this is a regression caused by the fix for bug #1963207 (https://github.com/openshift/ocs-operator/pull/1220).

But we have a simple workaround, and I don't think this is a blocker for 4.8.

The workaround is to run these 3 commands manually:
1. oc adm inspect cephclusters --all-namespaces
2. oc adm inspect cephobjectstoreusers --all-namespaces
3. oc adm inspect cephobjectstores --all-namespaces
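
If it helps, the three workaround commands could be wrapped in a small loop like the sketch below. The `oc adm inspect ... --all-namespaces` invocations are exactly the ones listed above. The function name `inspect_missing_resources` and the `OC` override variable are assumptions added here so the loop can be dry-run (for example with `OC=echo`).

```shell
# Run the three manual inspect commands from the workaround in one go.
# inspect_missing_resources and the OC variable are hypothetical helpers;
# OC defaults to the real "oc" binary and can be overridden for a dry run.
inspect_missing_resources() {
    for resource in cephclusters cephobjectstoreusers cephobjectstores; do
        # Stop at the first failing command so errors are not hidden.
        "${OC:-oc}" adm inspect "$resource" --all-namespaces || return 1
    done
}
```

Calling `inspect_missing_resources` runs the commands against the current cluster context. Calling `OC=echo inspect_missing_resources` only prints the arguments that would be passed to `oc`.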