Description of problem (please be as detailed as possible and provide log snippets):
Must-gather: Ceph files do not exist in the must-gather directory.

Version of all relevant components (if applicable):
OCP Version: 4.10.0-0.nightly-2021-12-23-153012
ODF Version: full_version=4.10.0-50
Platform: VMware

Ceph versions:
sh-4.4$ ceph versions
{
    "mon": {
        "ceph version 16.2.7-8.el8cp (342facd49bf8e908c5105a56bf7e7e6041643258) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-8.el8cp (342facd49bf8e908c5105a56bf7e7e6041643258) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.7-8.el8cp (342facd49bf8e908c5105a56bf7e7e6041643258) pacific (stable)": 3
    },
    "mds": {
        "ceph version 16.2.7-8.el8cp (342facd49bf8e908c5105a56bf7e7e6041643258) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.7-8.el8cp (342facd49bf8e908c5105a56bf7e7e6041643258) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.7-8.el8cp (342facd49bf8e908c5105a56bf7e7e6041643258) pacific (stable)": 10
    }
}

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Can this issue be reproduced? Can this issue be reproduced from the UI?
If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Run mg command:
   oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.10
2. Check content of the mg dir:
   E Exception: Files don't exist:
   E ['ceph_auth_list_--format_json-pretty', 'ceph_balancer_pool_ls_--format_json-pretty', 'ceph_balancer_status_--format_json-pretty', 'ceph_config-key_ls_--format_json-pretty', 'ceph_config_dump_--format_json-pretty', 'ceph_crash_ls_--format_json-pretty', 'ceph_crash_stat_--format_json-pretty', 'ceph_device_ls_--format_json-pretty', 'ceph_fs_dump_--format_json-pretty', 'ceph_fs_ls_--format_json-pretty', 'ceph_fs_status_--format_json-pretty', 'ceph_fs_subvolumegroup_ls_ocs-storagecluster-cephfilesystem_--format_json-pretty', 'ceph_health_detail_--format_json-pretty', 'ceph_mds_stat_--format_json-pretty', 'ceph_mgr_dump_--format_json-pretty', 'ceph_mgr_module_ls_--format_json-pretty', 'ceph_mgr_services_--format_json-pretty', 'ceph_mon_dump_--format_json-pretty', 'ceph_mon_stat_--format_json-pretty', 'ceph_osd_blacklist_ls_--format_json-pretty', 'ceph_osd_blocked-by_--format_json-pretty', 'ceph_osd_crush_class_ls_--format_json-pretty', 'ceph_osd_crush_dump_--format_json-pretty', 'ceph_osd_crush_rule_dump_--format_json-pretty', 'ceph_osd_crush_rule_ls_--format_json-pretty', 'ceph_osd_crush_show-tunables_--format_json-pretty', 'ceph_osd_crush_weight-set_dump_--format_json-pretty', 'ceph_osd_crush_weight-set_ls_--format_json-pretty', 'ceph_osd_df_--format_json-pretty', 'ceph_osd_df_tree_--format_json-pretty', 'ceph_osd_dump_--format_json-pretty', 'ceph_osd_getmaxosd_--format_json-pretty', 'ceph_osd_lspools_--format_json-pretty', 'ceph_osd_numa-status_--format_json-pretty', 'ceph_osd_perf_--format_json-pretty', 'ceph_osd_pool_ls_detail_--format_json-pretty', 'ceph_osd_stat_--format_json-pretty', 'ceph_osd_tree_--format_json-pretty', 'ceph_osd_utilization_--format_json-pretty', 'ceph_pg_dump_--format_json-pretty', 'ceph_pg_stat_--format_json-pretty', 'ceph_progress_--format_json-pretty', 'ceph_progress_json', 'ceph_progress_json_--format_json-pretty', 'ceph_quorum_status_--format_json-pretty', 'ceph_report_--format_json-pretty', 'ceph_service_dump_--format_json-pretty', 'ceph_status_--format_json-pretty', 'ceph_time-sync-status_--format_json-pretty', 'ceph_versions_--format_json-pretty', 'ceph_df_detail_--format_json-pretty']
   E Exception: Files don't exist:
   E ['ceph-volume_raw_list', 'ceph_auth_list', 'ceph_balancer_status', 'ceph_config-key_ls', 'ceph_config_dump', 'ceph_crash_stat', 'ceph_device_ls', 'ceph_fs_dump', 'ceph_fs_ls', 'ceph_fs_status', 'ceph_fs_subvolumegroup_ls_ocs-storagecluster-cephfilesystem', 'ceph_health_detail', 'ceph_mds_stat', 'ceph_mgr_dump', 'ceph_mgr_module_ls', 'ceph_mgr_services', 'ceph_mon_dump', 'ceph_mon_stat', 'ceph_osd_blocked-by', 'ceph_osd_crush_class_ls', 'ceph_osd_crush_dump', 'ceph_osd_crush_rule_dump', 'ceph_osd_crush_rule_ls', 'ceph_osd_crush_show-tunables', 'ceph_osd_crush_weight-set_dump', 'ceph_osd_df', 'ceph_osd_df_tree', 'ceph_osd_dump', 'ceph_osd_getmaxosd', 'ceph_osd_lspools', 'ceph_osd_numa-status', 'ceph_osd_perf', 'ceph_osd_pool_ls_detail', 'ceph_osd_stat', 'ceph_osd_tree', 'ceph_osd_utilization', 'ceph_pg_dump', 'ceph_pg_stat', 'ceph_quorum_status', 'ceph_report', 'ceph_service_dump', 'ceph_status', 'ceph_time-sync-status', 'ceph_versions', 'ceph_df_detail']
   E Exception: Files don't exist:
   E ['pools_rbd_ocs-storagecluster-cephblockpool']
3. Check gather-debug.log:
   collecting prepare volume logs from node compute-2
   ceph core dump collection completed
   ***skipping the ceph collection********
   total time taken by collection was 319 seconds
4. Check helper pod status:
   namespaces/openshift-storage/oc_output/pods_-owide:
   must-gather-jh7kt-helper   0/1   CreateContainerError   0   4m30s   10.128.2.43   compute-0   <none>   <none>

mg: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2035774

Actual results:

Expected results:

Additional info:
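The step-2 check can be scripted. A small sketch of such a check (the function name is hypothetical, and the usage example passes only a subset of the full expected file list shown in the exception above):

```shell
# Hypothetical helper: print which expected Ceph dump files are missing
# from a must-gather directory. First argument is the directory; the
# remaining arguments are the expected file names.
missing_ceph_files() {
  local mg_dir="$1"; shift
  local f
  for f in "$@"; do
    # The dumps can sit anywhere under the must-gather tree.
    find "$mg_dir" -type f -name "$f" | grep -q . || echo "missing: $f"
  done
}

# Example (subset of the full list):
#   missing_ceph_files /tmp/ocs_must_gather ceph_status ceph_health_detail ceph_osd_tree
```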
Helper pod is in CreateContainerError state. If the helper pod is not up, then this result is expected. The reason why the must-gather-helper pod is not up:

Warning  Failed  4m30s (x2 over 4m30s)  kubelet  Error: container create failed: time="2021-12-27T13:19:21Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
Warning  Failed  4m29s                  kubelet  Error: container create failed: time="2021-12-27T13:19:22Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
Warning  Failed  4m14s                  kubelet  Error: container create failed: time="2021-12-27T13:19:37Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"

Sebastien, must-gather is still using "tini":
https://github.com/red-hat-storage/ocs-operator/blob/d38316a811f30bffb3ce535bc6dda4ab5ee1dc3b/must-gather/templates/pod.template#L18
We already removed it via https://github.com/red-hat-storage/ocs-operator/pull/1406 for the toolbox pod; we need to do the same for the must-gather pod.
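The fix would mirror the toolbox change: stop wrapping the helper command in /tini, which the current rook-ceph operator image no longer ships, and invoke the toolbox script directly. A minimal sketch of what the container spec in must-gather/templates/pod.template would look like after the change (field values here are illustrative, not the exact repo contents):

```yaml
# Sketch only -- the exact spec in the repo may differ.
# Before: the container relied on /tini as PID 1:
#   command: ["/tini"]
#   args: ["-g", "--", "/usr/local/bin/toolbox.sh"]
# After: exec the toolbox script directly:
containers:
  - name: must-gather-helper
    command:
      - /usr/local/bin/toolbox.sh
```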
Indeed, Subham PTAL.
The main branch is still using tini: https://github.com/red-hat-storage/ocs-operator/blob/main/must-gather/templates/pod.template#L18. I'll make the changes. Assigning to myself.
Bug reconstructed: must-gather-helper pod stuck in CreateContainerError state.

SetUp:
OCP Version: 4.10.0-0.nightly-2022-01-24-020644
ODF Version: full_version=4.10.0-115
Platform: VMware

Ceph versions:
sh-4.4$ ceph versions
{
    "mon": {
        "ceph version 16.2.7-32.el8cp (34a1b8b0c674a15f06e190b3f9c91ab84fd79cc6) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-32.el8cp (34a1b8b0c674a15f06e190b3f9c91ab84fd79cc6) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.7-32.el8cp (34a1b8b0c674a15f06e190b3f9c91ab84fd79cc6) pacific (stable)": 3
    },
    "mds": {
        "ceph version 16.2.7-32.el8cp (34a1b8b0c674a15f06e190b3f9c91ab84fd79cc6) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.7-32.el8cp (34a1b8b0c674a15f06e190b3f9c91ab84fd79cc6) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.7-32.el8cp (34a1b8b0c674a15f06e190b3f9c91ab84fd79cc6) pacific (stable)": 10
    }
}

Test Process:
1. Run MG command:
   $ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.10
2. Check mg-helper pod status:
   $ oc get pods | grep helper
   must-gather-m7s2j-helper   0/1   CreateContainerError   0   2m13s
3. Check mg dir content:
   Files do not exist:
   ['ceph-volume_raw_list', 'ceph_auth_list', 'ceph_balancer_status', 'ceph_config-key_ls', 'ceph_config_dump', 'ceph_crash_stat', 'ceph_device_ls', 'ceph_fs_dump', 'ceph_fs_ls', 'ceph_fs_status', 'ceph_fs_subvolumegroup_ls_ocs-storagecluster-cephfilesystem', 'ceph_health_detail', 'ceph_mds_stat', 'ceph_mgr_dump', 'ceph_mgr_module_ls', 'ceph_mgr_services', 'ceph_mon_dump', 'ceph_mon_stat', 'ceph_osd_blocked-by', 'ceph_osd_crush_class_ls', 'ceph_osd_crush_dump', 'ceph_osd_crush_rule_dump', 'ceph_osd_crush_rule_ls', 'ceph_osd_crush_show-tunables', 'ceph_osd_crush_weight-set_dump', 'ceph_osd_df', 'ceph_osd_df_tree', 'ceph_osd_dump', 'ceph_osd_getmaxosd', 'ceph_osd_lspools', 'ceph_osd_numa-status', 'ceph_osd_perf', 'ceph_osd_pool_ls_detail', 'ceph_osd_stat', 'ceph_osd_tree', 'ceph_osd_utilization', 'ceph_pg_dump', 'ceph_pg_stat', 'ceph_quorum_status', 
'ceph_report', 'ceph_service_dump', 'ceph_status', 'ceph_time-sync-status', 'ceph_versions', 'ceph_df_detail'] ['ceph_auth_list_--format_json-pretty', 'ceph_balancer_pool_ls_--format_json-pretty', 'ceph_balancer_status_--format_json-pretty', 'ceph_config-key_ls_--format_json-pretty', 'ceph_config_dump_--format_json-pretty', 'ceph_crash_ls_--format_json-pretty', 'ceph_crash_stat_--format_json-pretty', 'ceph_device_ls_--format_json-pretty', 'ceph_fs_dump_--format_json-pretty', 'ceph_fs_ls_--format_json-pretty', 'ceph_fs_status_--format_json-pretty', 'ceph_fs_subvolumegroup_ls_ocs-storagecluster-cephfilesystem_--format_json-pretty', 'ceph_health_detail_--format_json-pretty', 'ceph_mds_stat_--format_json-pretty', 'ceph_mgr_dump_--format_json-pretty', 'ceph_mgr_module_ls_--format_json-pretty', 'ceph_mgr_services_--format_json-pretty', 'ceph_mon_dump_--format_json-pretty', 'ceph_mon_stat_--format_json-pretty', 'ceph_osd_blacklist_ls_--format_json-pretty', 'ceph_osd_blocked-by_--format_json-pretty', 'ceph_osd_crush_class_ls_--format_json-pretty', 'ceph_osd_crush_dump_--format_json-pretty', 'ceph_osd_crush_rule_dump_--format_json-pretty', 'ceph_osd_crush_rule_ls_--format_json-pretty', 'ceph_osd_crush_show-tunables_--format_json-pretty', 'ceph_osd_crush_weight-set_dump_--format_json-pretty', 'ceph_osd_crush_weight-set_ls_--format_json-pretty', 'ceph_osd_df_--format_json-pretty', 'ceph_osd_df_tree_--format_json-pretty', 'ceph_osd_dump_--format_json-pretty', 'ceph_osd_getmaxosd_--format_json-pretty', 'ceph_osd_lspools_--format_json-pretty', 'ceph_osd_numa-status_--format_json-pretty', 'ceph_osd_perf_--format_json-pretty', 'ceph_osd_pool_ls_detail_--format_json-pretty', 'ceph_osd_stat_--format_json-pretty', 'ceph_osd_tree_--format_json-pretty', 'ceph_osd_utilization_--format_json-pretty', 'ceph_pg_dump_--format_json-pretty', 'ceph_pg_stat_--format_json-pretty', 'ceph_progress_--format_json-pretty', 'ceph_progress_json', 'ceph_progress_json_--format_json-pretty', 
'ceph_quorum_status_--format_json-pretty', 'ceph_report_--format_json-pretty', 'ceph_service_dump_--format_json-pretty', 'ceph_status_--format_json-pretty', 'ceph_time-sync-status_--format_json-pretty', 'ceph_versions_--format_json-pretty', 'ceph_df_detail_--format_json-pretty']
Oded, can you help with the describe output of the helper pod or the complete must-gather?
MG: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/dnd-004ai1c33-s/dnd-004ai1c33-s_20220124T035144/logs/failed_testcase_ocs_logs_1642999729/test_multiple_pvc_creation_deletion_scale%5bReadWriteMany-cephfs%5d_ocs_logs/ocs_must_gather/

Describe helper pod:
Events:
  Type     Reason          Age                     From               Message
  ----     ------          ----                    ----               -------
  Normal   Scheduled       4m52s                   default-scheduler  Successfully assigned openshift-storage/must-gather-f5nh9-helper to compute-0
  Normal   AddedInterface  4m50s                   multus             Add eth0 [10.131.0.191/23] from openshift-sdn
  Warning  Failed          4m50s                   kubelet            Error: container create failed: time="2022-01-25T13:08:09Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          4m49s                   kubelet            Error: container create failed: time="2022-01-25T13:08:10Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          4m48s                   kubelet            Error: container create failed: time="2022-01-25T13:08:11Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          4m34s                   kubelet            Error: container create failed: time="2022-01-25T13:08:25Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          4m23s                   kubelet            Error: container create failed: time="2022-01-25T13:08:36Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          4m8s                    kubelet            Error: container create failed: time="2022-01-25T13:08:51Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          3m53s                   kubelet            Error: container create failed: time="2022-01-25T13:09:06Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          3m39s                   kubelet            Error: container create failed: time="2022-01-25T13:09:20Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          3m24s                   kubelet            Error: container create failed: time="2022-01-25T13:09:35Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          2m42s (x3 over 3m9s)    kubelet            (combined from similar events): Error: container create failed: time="2022-01-25T13:10:17Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Normal   Pulled          2m27s (x13 over 4m50s)  kubelet            Container image "quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:deffe459757e10072fdec52c73534af903fa2815370d20bb777d3dd8a074e166" already present on machine
Looks like it is taking the last saved configuration:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/dnd-004ai1c33-s/dnd-004ai1c33-s_20220124T035144/logs/failed_testcase_ocs_logs_1642999729/test_multiple_pvc_creation_deletion_scale%5bReadWriteMany-cephfs%5d_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-128c52cbe4a2f7fe58ed16ea2a2de72a79534add1c7bac6f769397f62b5ab165/namespaces/openshift-storage/pods/must-gather-zswsq-helper/must-gather-zswsq-helper.yaml

kubectl.kubernetes.io/last-applied-configuration: |
  {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"must-gather-zswsq-helper","namespace":"openshift-storage"},"spec":{"containers":[{"args":["-g","--","/usr/local/bin/toolbox.sh"],"command":["/tini"],"env":[{"name":"ROOK_CEPH_USERNAME","valueFrom":{"secretKeyRef":{"key":"ceph-username","name":"rook-ceph-mon"}}},{"name":"ROOK_CEPH_SECRET","valueFrom":{"secretKeyRef":{"key":"ceph-secret","name":"rook-ceph-mon"}}}],"image":"quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:deffe459757e10072fdec52c73534af903fa2815370d20bb777d3dd8a074e166","imagePullPolicy":"IfNotPresent","name":"must-gather-helper","securityContext":{"privileged":true},"volumeMounts":[{"mountPath":"/dev","name":"dev"},{"mountPath":"/sys/bus","name":"sysbus"},{"mountPath":"/lib/modules","name":"libmodules"},{"mountPath":"/etc/rook","name":"mon-endpoint-volume"}]}],"tolerations":[{"effect":"NoSchedule","key":"node.ocs.openshift.io/storage","operator":"Equal","value":"true"}],"volumes":[{"hostPath":{"path":"/dev"},"name":"dev"},{"hostPath":{"path":"/sys/bus"},"name":"sysbus"},{"hostPath":{"path":"/lib/modules"},"name":"libmodules"},{"configMap":{"items":[{"key":"data","path":"mon-endpoints"}],"name":"rook-ceph-mon-endpoints"},"name":"mon-endpoint-volume"}]}}

How do we make sure it takes the latest?
(In reply to Mudit Agarwal from comment #12)
> Looks like it is taking that last saved configuration

Right, it looks like it is picking up the old one:

```
containers:
  - args:
      - -g
      - --
      - /usr/local/bin/toolbox.sh
    command:
      - /tini
    env:
```

My PR removed tini.

> How do we make sure it takes the latest?
Try removing the cached quay.io/rhceph-dev/ocs-must-gather:latest-4.10 image and test again to make sure it is picking up the latest build. Thanks.
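One way to clear the cached image so the next run pulls the current build is to remove it from each node via oc debug and the node's container runtime. This is an illustrative sketch, not a verified procedure: the function name is hypothetical, and whether crictl is the right tool depends on the cluster's runtime:

```shell
# Illustrative sketch: drop the cached must-gather image from every node
# so the next `oc adm must-gather` run pulls the latest build from quay.
IMAGE="quay.io/rhceph-dev/ocs-must-gather:latest-4.10"

purge_cached_image() {
  local node
  for node in $(oc get nodes -o name); do
    # chroot into the host and ask the container runtime to remove the
    # image; ignore failures on nodes that never cached it.
    oc debug "$node" -- chroot /host crictl rmi "$IMAGE" || true
  done
}
```

After purging, rerun must-gather and check the helper pod's events for an image pull rather than "already present on machine".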
Bug not fixed.

Setup:
OCP Version: 4.10.0-0.nightly-2022-01-29-215708
ODF Version: 4.10.0-128
Provider: VMware

Test Process:
1. Run MG:
   oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.10 --dest-dir=/tmp/tmpbnvcqsja_ocs_logs/ocs_must_gather
2. Check gather-debug.log:
   waiting for 1436 1437 1450 1451 1476 1477 to terminate
   collecting crash core dump from node compute-2
   collecting prepare volume logs from node compute-2
   ceph core dump collection completed
   skipping the ceph collection
3. Check MG content:
   Exception: Files don't exist:
   ['ceph_auth_list', 'ceph_balancer_status', 'ceph_config-key_ls', 'ceph_config_dump', 'ceph_crash_stat', 'ceph_device_ls', 'ceph_fs_dump', 'ceph_fs_ls', 'ceph_fs_status', 'ceph_fs_subvolumegroup_ls_ocs-storagecluster-cephfilesystem', 'ceph_health_detail', 'ceph_mds_stat', 'ceph_mgr_dump', 'ceph_mgr_module_ls', 'ceph_mgr_services', 'ceph_mon_dump', 'ceph_mon_stat', 'ceph_osd_blocked-by', 'ceph_osd_crush_class_ls', 'ceph_osd_crush_dump', 'ceph_osd_crush_rule_dump', 'ceph_osd_crush_rule_ls', 'ceph_osd_crush_show-tunables', 'ceph_osd_crush_weight-set_dump', 'ceph_osd_df', 'ceph_osd_df_tree', 'ceph_osd_dump', 'ceph_osd_getmaxosd', 'ceph_osd_lspools', 'ceph_osd_numa-status', 'ceph_osd_perf', 'ceph_osd_pool_ls_detail', 'ceph_osd_stat', 'ceph_osd_tree', 'ceph_osd_utilization', 'ceph_pg_dump', 'ceph_pg_stat', 'ceph_quorum_status', 'ceph_report', 'ceph_service_dump', 'ceph_status', 'ceph_time-sync-status', 'ceph_versions', 'ceph_df_detail']
   ['ceph_auth_list_--format_json-pretty', 'ceph_balancer_pool_ls_--format_json-pretty', 'ceph_balancer_status_--format_json-pretty', 'ceph_config-key_ls_--format_json-pretty', 'ceph_config_dump_--format_json-pretty', 'ceph_crash_ls_--format_json-pretty', 'ceph_crash_stat_--format_json-pretty', 'ceph_device_ls_--format_json-pretty', 'ceph_fs_dump_--format_json-pretty', 'ceph_fs_ls_--format_json-pretty', 'ceph_fs_status_--format_json-pretty', 
'ceph_fs_subvolumegroup_ls_ocs-storagecluster-cephfilesystem_--format_json-pretty', 'ceph_health_detail_--format_json-pretty', 'ceph_mds_stat_--format_json-pretty', 'ceph_mgr_dump_--format_json-pretty', 'ceph_mgr_module_ls_--format_json-pretty', 'ceph_mgr_services_--format_json-pretty', 'ceph_mon_dump_--format_json-pretty', 'ceph_mon_stat_--format_json-pretty', 'ceph_osd_blacklist_ls_--format_json-pretty', 'ceph_osd_blocked-by_--format_json-pretty', 'ceph_osd_crush_class_ls_--format_json-pretty', 'ceph_osd_crush_dump_--format_json-pretty', 'ceph_osd_crush_rule_dump_--format_json-pretty', 'ceph_osd_crush_rule_ls_--format_json-pretty', 'ceph_osd_crush_show-tunables_--format_json-pretty', 'ceph_osd_crush_weight-set_dump_--format_json-pretty', 'ceph_osd_crush_weight-set_ls_--format_json-pretty', 'ceph_osd_df_--format_json-pretty', 'ceph_osd_df_tree_--format_json-pretty', 'ceph_osd_dump_--format_json-pretty', 'ceph_osd_getmaxosd_--format_json-pretty', 'ceph_osd_lspools_--format_json-pretty', 'ceph_osd_numa-status_--format_json-pretty', 'ceph_osd_perf_--format_json-pretty', 'ceph_osd_pool_ls_detail_--format_json-pretty', 'ceph_osd_stat_--format_json-pretty', 'ceph_osd_tree_--format_json-pretty', 'ceph_osd_utilization_--format_json-pretty', 'ceph_pg_dump_--format_json-pretty', 'ceph_pg_stat_--format_json-pretty', 'ceph_progress_--format_json-pretty', 'ceph_progress_json', 'ceph_progress_json_--format_json-pretty', 'ceph_quorum_status_--format_json-pretty', 'ceph_report_--format_json-pretty', 'ceph_service_dump_--format_json-pretty', 'ceph_status_--format_json-pretty', 'ceph_time-sync-status_--format_json-pretty', 'ceph_versions_--format_json-pretty', 'ceph_df_detail_--format_json-pretty']
4. Get MG helper pod status:
   $ oc get pods | grep helper
   must-gather-7s4xl-helper   0/1   CreateContainerError   0   10s

Events:
  Type     Reason     Age    From               Message
  ----     ------     ----   ----               -------
  Normal   Scheduled  3m20s  default-scheduler  Successfully assigned openshift-storage/must-gather-7s4xl-helper to 
compute-0
  Normal   AddedInterface  3m18s                 multus   Add eth0 [10.129.2.49/23] from openshift-sdn
  Warning  Failed          3m18s                 kubelet  Error: container create failed: time="2022-01-30T11:55:34Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          3m17s                 kubelet  Error: container create failed: time="2022-01-30T11:55:35Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          3m3s                  kubelet  Error: container create failed: time="2022-01-30T11:55:49Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          2m50s                 kubelet  Error: container create failed: time="2022-01-30T11:56:02Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          2m35s                 kubelet  Error: container create failed: time="2022-01-30T11:56:17Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          2m20s                 kubelet  Error: container create failed: time="2022-01-30T11:56:32Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          2m5s                  kubelet  Error: container create failed: time="2022-01-30T11:56:47Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          114s                  kubelet  Error: container create failed: time="2022-01-30T11:56:58Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          103s                  kubelet  Error: container create failed: time="2022-01-30T11:57:09Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed          60s (x3 over 88s)     kubelet  (combined from similar events): Error: container create failed: time="2022-01-30T11:57:52Z" level=error msg="container_linux.go:380: starting container process caused: exec: \"/tini\": stat /tini: no such file or directory"
  Normal   Pulled          47s (x13 over 3m18s)  kubelet  Container image "quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:553b332e4ae53869f99593621d5e25c889f1907cc7babb11dca6ad61701499c5" already present on machine

MG dir: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-034ai3c33-s/j-034ai3c33-s_20220126T021925/logs/failed_testcase_ocs_logs_1643167244/test_multiple_pvc_creation_deletion_scale%5bReadWriteMany-CephBlockPool%5d_ocs_logs/ocs_must_gather/
Did you follow the instructions provided in https://bugzilla.redhat.com/show_bug.cgi?id=2035774#c14? Did you remove the must-gather image from your system and confirm that it is pulling it from quay?

I guess not; it looks like it used the image already present on the machine. See this message:

>> Container image "quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:553b332e4ae53869f99593621d5e25c889f1907cc7babb11dca6ad61701499c5" already present on machine

Please remove the must-gather image from your system, make sure that while creating the helper pod it pulls the image from the quay repo, and then, if you still see the issue, move it back to ASSIGNED.
The ocs-must-gather latest-4.10 image was last modified 2 months ago: https://quay.io/repository/rhceph-dev/ocs-must-gather?tab=tags
We need to update the image.
Bug fixed.

SetUp:
ODF Version: 4.10.0-143
OCP Version: 4.10.0-0.nightly-2022-02-02-220834
Platform: AWS

Test Process:
1. Run mg command:
   oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.10
2. Check content of the mg dir.
MG Dir: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-029au3r33-d/j-029au3r33-d_20220203T120203/logs/deployment_1643890080/ocs_must_gather/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1372