Description of problem (please be as detailed as possible and provide log snippets):

The collection of kernel and journal logs, which was recently added to must-gather, is incomplete: must-gather does not collect logs from all the worker nodes.

Actual results:

mrajanna@fedora rook $ oc get nodes
NAME        STATUS   ROLES    AGE   VERSION
compute-0   Ready    worker   8d    v1.25.4+77bec7a
compute-1   Ready    worker   8d    v1.25.4+77bec7a
compute-2   Ready    worker   8d    v1.25.4+77bec7a
compute-3   Ready    worker   8d    v1.25.4+77bec7a
compute-4   Ready    worker   8d    v1.25.4+77bec7a
compute-5   Ready    worker   8d    v1.25.4+77bec7a

Only three of the six worker nodes have logs in the must-gather output:

[DIR] journal_compute-0/   2023-01-16 00:39   -
[DIR] journal_compute-3/   2023-01-16 00:39   -
[DIR] journal_compute-5/   2023-01-16 00:40   -
[DIR] kernel_compute-0/    2023-01-16 00:40   -
[DIR] kernel_compute-3/    2023-01-16 00:40   -
[DIR] kernel_compute-5/    2023-01-16 00:40   -

Expected results:

Logs from all the nodes must be collected.

Additional info:

Refer to https://bugzilla.redhat.com/show_bug.cgi?id=2160034#c24 for more details.
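The gap above can be checked mechanically. Below is a minimal sketch (the function name and input shapes are illustrative, not part of the must-gather tooling) that diffs the worker node list against the journal_* directories in the gathered output:

```shell
#!/usr/bin/env bash
# Sketch only: report worker nodes whose journal logs are missing from a
# must-gather. Inputs are newline-separated names; nothing here is part
# of the actual must-gather image.
missing_journal_logs() {
  local nodes="$1"   # node names, e.g. from: oc get nodes --no-headers
  local dirs="$2"    # collected directory names, e.g. journal_compute-0
  while read -r node; do
    [ -z "$node" ] && continue
    # grep -qx: quiet, whole-line match against the collected dir list
    printf '%s\n' "$dirs" | grep -qx "journal_${node}" \
      || printf '%s\n' "$node"
  done <<< "$nodes"
}
```

Run against the listing above, this would report compute-1, compute-2, and compute-4 as missing.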
Moved it back, automated query
The PR was never backported to 4.13; do not move this bug to ON_QA until a build with the fix is available.
Bug fixed.

OCP Version: 4.13.0-0.nightly-2023-04-21-084440
ODF Version: odf-operator.v4.13.0-172.stable
Platform: vSphere

Test Procedure:

1. Collect the must-gather:
$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.13

2. Verify that kernel and journal logs from all worker nodes exist:

$ oc get nodes
NAME              STATUS   ROLES                  AGE   VERSION
compute-0         Ready    worker                 26h   v1.26.3+379cd9f
compute-1         Ready    worker                 26h   v1.26.3+379cd9f
compute-2         Ready    worker                 26h   v1.26.3+379cd9f
control-plane-0   Ready    control-plane,master   26h   v1.26.3+379cd9f
control-plane-1   Ready    control-plane,master   26h   v1.26.3+379cd9f
control-plane-2   Ready    control-plane,master   26h   v1.26.3+379cd9f

oviner:quay-io-rhceph-dev-ocs-must-gather-sha256-79f522ddb035becf5878305c4af24de6d83610b42e849505b5159ab20b8bb5fa$ find -name "*kernel_*"
./ceph/kernel_compute-0
./ceph/kernel_compute-0/kernel_compute-0.gz
./ceph/kernel_compute-1
./ceph/kernel_compute-1/kernel_compute-1.gz
./ceph/kernel_compute-2
./ceph/kernel_compute-2/kernel_compute-2.gz

oviner:quay-io-rhceph-dev-ocs-must-gather-sha256-79f522ddb035becf5878305c4af24de6d83610b42e849505b5159ab20b8bb5fa$ find -name "*journal_*"
./ceph/journal_compute-0
./ceph/journal_compute-0/journal_compute-0.gz
./ceph/journal_compute-1
./ceph/journal_compute-1/journal_compute-1.gz
./ceph/journal_compute-2
./ceph/journal_compute-2/journal_compute-2.gz
Hi Ilya,

In this setup, we collected journal and kernel logs from all worker nodes.

Setup: ODF 4.13, OCP 4.13, vSphere UPI
OCS MG DIR: https://url.corp.redhat.com/53c1088
Job link: https://url.corp.redhat.com/48355c2

@ypadia Do we need to collect these logs on worker nodes without the OCS label [cluster.ocs.openshift.io/openshift-storage]?

compute-5 Ready worker 62m v1.26.2+22308ca 10.1.112.80 10.1.112.80 Red Hat Enterprise Linux CoreOS 413.92.202304101935-0 (Plow) 5.14.0-284.10.1.el9_2.x86_64 cri-o://1.26.3-2.rhaos4.13.gitafec31f.el9 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-5,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.kubernetes.io/zone=data-2
Only journal_compute-4 and kernel_compute-4 logs were collected on a setup with 6 worker nodes, 3 of which carry the OCS label.

Test process:
1. Deploy an OCP cluster with 6 worker nodes.
2. Install the ODF operator.
3. Install the storage cluster [enable OCS on 3 worker nodes]:

$ oc get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
compute-0 Ready worker 21h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack1
compute-1 Ready worker 21h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack0
compute-2 Ready worker 21h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-2,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack2
compute-3 Ready worker 21h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-3,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos
compute-4 Ready worker 22h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-4,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos
compute-5 Ready worker 21h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-5,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos
control-plane-0 Ready control-plane,master 22h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=control-plane-0,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,node.openshift.io/os_id=rhcos
control-plane-1 Ready control-plane,master 22h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=control-plane-1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,node.openshift.io/os_id=rhcos
control-plane-2 Ready control-plane,master 22h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=control-plane-2,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,node.openshift.io/os_id=rhcos

oviner:ClusterPath$ oc get nodes --show-labels | grep ocs
compute-0 Ready worker 21h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack1
compute-1 Ready worker 21h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack0
compute-2 Ready worker 21h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-2,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack2

4. Collect the OCS must-gather:
$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.13

5. Check the content of the must-gather directory:

oviner:ceph$ ls -la
total 8
drwxr-xr-x. 1 oviner oviner  250 May 17 14:36 .
drwxrwxrwx. 1 oviner oviner  172 May 17 14:36 ..
-rw-r--r--. 1 oviner oviner 3336 May 17 14:27 event-filter.html
drwxr-xr-x. 1 oviner oviner   40 May 17 14:36 journal_compute-4
drwxr-xr-x. 1 oviner oviner   38 May 17 14:36 kernel_compute-4
drwxr-xr-x. 1 oviner oviner  666 May 17 14:36 logs
drwxr-xr-x. 1 oviner oviner 2378 May 17 14:36 must_gather_commands
drwxr-xr-x. 1 oviner oviner 4254 May 17 14:36 must_gather_commands_json_output
drwxr-xr-x. 1 oviner oviner   34 May 17 14:36 namespaces
-rw-r--r--. 1 oviner oviner  768 May 17 14:27 timestamp
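A plausible way to see the mismatch in the run above: filtering the `oc get nodes --show-labels` output on the OCS storage label matches only the labeled subset of workers, while filtering on the worker role matches all of them. The two helper functions below are illustrative only (not part of must-gather); the label keys are the real ones from this report:

```shell
#!/usr/bin/env bash
# Sketch only: extract node names from `oc get nodes --show-labels`-style
# output, filtered by a label. Selecting on the OCS storage label yields
# the labeled subset; selecting on the worker role yields all workers.
ocs_nodes()    { awk '/cluster\.ocs\.openshift\.io\/openshift-storage=/ { print $1 }'; }
worker_nodes() { awk '/node-role\.kubernetes\.io\/worker=/ { print $1 }'; }
```

Piping the node listing above through `ocs_nodes` returns only compute-0, compute-1, and compute-2, while `worker_nodes` returns all six compute nodes.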
OCS MG: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2161937-update.tar.gz
Fixed.

1. Deploy a cluster with 6 worker nodes.
2. Label 3 worker nodes with the OCS label.
3. Collect the must-gather:
$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.13
4. Verify that journal and kernel logs were collected for all 6 worker nodes [both with and without the OCS label]:

$ oc get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
compute-0 Ready worker 7h51m v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack0
compute-1 Ready worker 7h50m v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack1
compute-2 Ready worker 7h51m v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-2,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos,topology.rook.io/rack=rack2
compute-3 Ready worker 7h52m v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-3,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos
compute-4 Ready worker 7h53m v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-4,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos
compute-5 Ready worker 7h51m v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=compute-5,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=vsphere-vm.cpu-16.mem-64gb.os-unknown,node.openshift.io/os_id=rhcos
control-plane-0 Ready control-plane,master 8h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=control-plane-0,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,node.openshift.io/os_id=rhcos
control-plane-1 Ready control-plane,master 8h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=control-plane-1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,node.openshift.io/os_id=rhcos
control-plane-2 Ready control-plane,master 8h v1.26.3+b404935 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=control-plane-2,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown,node.openshift.io/os_id=rhcos

$ pwd
/home/oviner/ClusterPath/must-gather.local.4357408987461476974/quay-io-rhceph-dev-ocs-must-gather-sha256-10071ddc29383af01d60eadfa4d6f2bd631cfd4c06fcdf7efdb655a84b13a4f1/ceph

oviner:ceph$ ls -l
total 8
drwxr-xr-x. 1 oviner oviner    6 May 25 18:47 ceph_daemon_log_compute-0
drwxr-xr-x. 1 oviner oviner    6 May 25 18:47 ceph_daemon_log_compute-1
drwxr-xr-x. 1 oviner oviner    6 May 25 18:47 ceph_daemon_log_compute-2
-rw-r--r--. 1 oviner oviner 3336 May 25 18:46 event-filter.html
drwxr-xr-x. 1 oviner oviner   40 May 25 18:47 journal_compute-0
drwxr-xr-x. 1 oviner oviner   40 May 25 18:47 journal_compute-1
drwxr-xr-x. 1 oviner oviner   40 May 25 18:47 journal_compute-2
drwxr-xr-x. 1 oviner oviner   40 May 25 18:47 journal_compute-3
drwxr-xr-x. 1 oviner oviner   40 May 25 18:47 journal_compute-4
drwxr-xr-x. 1 oviner oviner   40 May 25 18:47 journal_compute-5
drwxr-xr-x. 1 oviner oviner   38 May 25 18:47 kernel_compute-0
drwxr-xr-x. 1 oviner oviner   38 May 25 18:47 kernel_compute-1
drwxr-xr-x. 1 oviner oviner   38 May 25 18:47 kernel_compute-2
drwxr-xr-x. 1 oviner oviner   38 May 25 18:47 kernel_compute-3
drwxr-xr-x. 1 oviner oviner   38 May 25 18:47 kernel_compute-4
drwxr-xr-x. 1 oviner oviner   38 May 25 18:47 kernel_compute-5
drwxr-xr-x. 1 oviner oviner  666 May 25 18:47 logs
drwxr-xr-x. 1 oviner oviner 2648 May 25 18:47 must_gather_commands
drwxr-xr-x. 1 oviner oviner 4254 May 25 18:47 must_gather_commands_json_output
drwxr-xr-x. 1 oviner oviner   34 May 25 18:47 namespaces
-rw-r--r--. 1 oviner oviner  770 May 25 18:46 timestamp
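The fixed layout above can be sanity-checked by counting directories: the number of journal_* entries and the number of kernel_* entries in the ceph/ directory should each equal the worker-node count (6 here). A minimal sketch (the helper name is made up, not part of must-gather):

```shell
#!/usr/bin/env bash
# Sketch only: count must-gather entries whose names start with a given
# prefix, e.g. journal_ or kernel_, in a one-name-per-line listing such
# as the output of `ls -1`.
count_collected() {
  local prefix="$1" listing="$2"
  # grep -c counts lines that match the anchored prefix
  printf '%s\n' "$listing" | grep -c "^${prefix}"
}
```

Applied to the `ls -l` output above, both `count_collected journal_ ...` and `count_collected kernel_ ...` return 6, matching the 6 worker nodes.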
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:3742
The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 120 days.