Bug 2096395 - [GSS] F I0519 1 mount_linux.go:175] systemd-run output: System has not been booted with systemd as init system (PID 1). Can't operate. [NEEDINFO]
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-driver
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ODF 4.12.0
Assignee: Niels de Vos
QA Contact: Yosi Ben Shimon
URL:
Whiteboard:
Duplicates: 2105963
Depends On:
Blocks:
 
Reported: 2022-06-13 17:55 UTC by Levy Sant'Anna
Modified: 2023-08-09 16:37 UTC
CC List: 15 users

Fixed In Version: 4.12.0-74
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-02-08 14:06:28 UTC
Embargoed:
etamir: needinfo? (cgaynor)


Attachments
logs containing NodeStageVolume and NodeGetVolumeStats calls (12.98 KB, text/plain)
2022-07-18 10:24 UTC, Niels de Vos
no flags


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-csi pull 3225 0 None Merged logging: reduce systemd related errors while mounting 2022-10-31 14:04:45 UTC
Github kubernetes kubernetes pull 111083 0 None Merged mount-utils: only detect systemd when needed 2022-10-31 14:04:48 UTC
Github red-hat-storage ocs-ci pull 6836 0 None Merged BZ #2096395 automation. 2023-01-02 12:26:35 UTC

Comment 5 Michal Sekletar 2022-06-27 12:33:16 UTC
This issue was previously reported in the Ceph CSI plugin upstream and was closed as NOTABUG:
https://github.com/ceph/ceph-csi/issues/2890

At any rate, this is not a systemd bug, so let's move this to OCP for further analysis.

Comment 6 Jan Safranek 2022-06-28 13:59:35 UTC
Moving to ODF, as they maintain the CephFS CSI driver.

Comment 7 Niels de Vos 2022-06-29 13:28:36 UTC
(In reply to Levy Sant'Anna from comment #0)
> Description of problem:
> CephFS CSI plugin pod can't mount RBD PVC due systemd error:

The systemd error is not the cause: the reported error is only a warning, and mounting is retried in a different way after it occurs.

As Michal mentions in comment #5, this is not a bug in Ceph-CSI, but rather confusing logging from the https://github.com/kubernetes/mount-utils package.


We'll need more logs to identify the real error. Please provide the full logs of the NodeStageVolume procedure, or ideally an ocs-must-gather archive for our inspection.
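
For context, the message quoted in the bug title comes from a probe that the mount helper library runs before mounting. A minimal sketch of that probe-and-fallback behaviour, in shell for illustration only (the real k8s.io/mount-utils code is Go and structured differently; SRC, TGT and OPTS are placeholder variables):

    # probe whether systemd is PID 1; inside a container this fails with
    # "System has not been booted with systemd as init system (PID 1)."
    if systemd-run --scope -- true >/dev/null 2>&1; then
        # systemd is available: run the mount inside a transient scope
        systemd-run --scope -- mount -t ceph "$SRC" "$TGT" -o "$OPTS"
    else
        # no systemd: the probe's output gets logged (the scary message)
        # and the mount is simply executed directly
        mount -t ceph "$SRC" "$TGT" -o "$OPTS"
    fi

In other words, the mount itself still succeeds; only the probe's output ends up in the log.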

Comment 14 Niels de Vos 2022-07-08 07:55:35 UTC
Backport https://github.com/red-hat-storage/ceph-csi/pull/110 will be included in ODF-4.12

This reduces the number of confusing systemd errors, but only for the NodeStageVolume call, not for NodeGetVolumeStats. Some of the non-errors will likely still get reported. I expect that the support team will document this in one of their KCS solutions so that customers can find details about the messages and won't be concerned about them.

Comment 18 Niels de Vos 2022-07-12 06:49:30 UTC
*** Bug 2105963 has been marked as a duplicate of this bug. ***

Comment 20 Niels de Vos 2022-07-12 08:49:51 UTC
https://github.com/kubernetes/kubernetes/pull/111083 has been posted for inclusion in the k8s.io/mount-utils package. That package is responsible for the scary logging that is included in Ceph-CSI. Once that PR is merged and a new Kubernetes release contains the change, Ceph-CSI can update the package and the messages should be gone from the NodeGetVolumeStats procedure too.

Comment 25 Niels de Vos 2022-07-15 10:56:23 UTC
A test image with an updated (current master branch) k8s.io/mount-utils package is available at quay.io/nixpanic/cephcsi:no-systemd.
This container-image is based on the current Ceph-CSI upstream, but that should not matter for checking the logs while mounting and gathering stats.

Comment 26 Niels de Vos 2022-07-18 10:24:43 UTC
Created attachment 1897901 [details]
logs containing NodeStageVolume and NodeGetVolumeStats calls

This is the log from a csi-cephfsplugin container that was used for NodeStageVolume (mount) and NodeGetVolumeStats (Kubelet metrics) procedures. With the test container-image that contains the updated k8s.io/mount-utils package, no scary systemd messages are logged.

Comment 27 Niels de Vos 2022-07-27 09:08:27 UTC
This bug was reported for (confusing) log messages in the CephFS components of the Ceph-CSI driver. Everything needed to remove those messages has been included in the upcoming builds of ODF-4.12.

Enhancements for the same log messages in RBD have been proposed, but those changes still need to be included in the Kubernetes project before they can be picked up by Ceph-CSI.

From my point of view, this BZ can be verified by QE on CephFS once an ODF-4.12 build is available.

Comment 32 Niels de Vos 2022-10-31 14:03:10 UTC
To verify the fixes:

1. create a CephFS PVC
2. use the CephFS PVC with a Pod to have it mounted
3. identify the csi-cephfsplugin DaemonSet pod running on the worker node where the Pod was scheduled
4. gather the logs of the csi-cephfsplugin container of that pod (see the example commands below)

The logs of the csi-cephfsplugin container (the `cephcsi` executable) contain the NodeStageVolume and NodeGetVolumeStats calls for the volume. The log lines carrying the request ID of those procedures should not contain any of the systemd warning messages.

Note that the NodeGetVolumeStats procedure is executed by Kubelet at some interval. If the log does not contain a NodeGetVolumeStats call, leave the Pod running for a little longer and check again.
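
For example, assuming the default openshift-storage namespace, the usual app=csi-cephfsplugin pod label and the csi-cephfsplugin container name (these may differ on a given cluster):

    # find the csi-cephfsplugin pod running on the worker node that mounted the PVC
    oc -n openshift-storage get pods -l app=csi-cephfsplugin -o wide

    # dump that pod's plugin container log and check the relevant calls;
    # the systemd warning should no longer show up next to them
    oc -n openshift-storage logs <csi-cephfsplugin-pod> -c csi-cephfsplugin \
        | grep -E 'NodeStageVolume|NodeGetVolumeStats|systemd'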

Comment 33 Yosi Ben Shimon 2022-11-06 10:32:27 UTC
Tested on 4.12.091 following the steps in comment #32.
- "NodeStageVolume" and "NodeGetVolumeStats" were found in the logs.
- None of the warning messages was found.

I1106 07:37:28.123886       1 cephcsi.go:192] Driver version: release-4.12 and Git version: 9d4878f8febf39a98fb9e188ccabed6f6d6b8849
I1106 07:37:28.124215       1 cephcsi.go:210] Initial PID limit is set to 205116
I1106 07:37:28.124264       1 cephcsi.go:216] Reconfigured PID limit to -1 (max)
I1106 07:37:28.124441       1 cephcsi.go:241] Starting driver type: cephfs with name: openshift-storage.cephfs.csi.ceph.com
I1106 07:37:28.138578       1 volumemounter.go:79] loaded mounter: kernel
E1106 07:37:28.138610       1 volumemounter.go:88] failed to run ceph-fuse exec: "ceph-fuse": executable file not found in $PATH
I1106 07:37:28.142229       1 mount_linux.go:283] Detected umount with safe 'not mounted' behavior
I1106 07:37:28.142702       1 server.go:126] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}
I1106 07:37:28.832224       1 utils.go:195] ID: 1 GRPC call: /csi.v1.Identity/GetPluginInfo
I1106 07:37:28.835823       1 utils.go:206] ID: 1 GRPC request: {}
I1106 07:37:28.835855       1 identityserver-default.go:39] ID: 1 Using default GetPluginInfo
I1106 07:37:28.835918       1 utils.go:212] ID: 1 GRPC response: {"name":"openshift-storage.cephfs.csi.ceph.com","vendor_version":"release-4.12"}
I1106 07:37:29.270274       1 utils.go:195] ID: 2 GRPC call: /csi.v1.Node/NodeGetInfo
I1106 07:37:29.270344       1 utils.go:206] ID: 2 GRPC request: {}
I1106 07:37:29.270358       1 nodeserver-default.go:51] ID: 2 Using default NodeGetInfo
I1106 07:37:29.270565       1 utils.go:212] ID: 2 GRPC response: {"accessible_topology":{},"node_id":"compute-2"}
I1106 07:43:22.263014       1 utils.go:195] ID: 3 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1106 07:43:22.263098       1 utils.go:206] ID: 3 GRPC request: {}
I1106 07:43:22.263310       1 utils.go:212] ID: 3 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1106 07:43:22.277231       1 utils.go:195] ID: 4 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1106 07:43:22.277300       1 utils.go:206] ID: 4 GRPC request: {}
I1106 07:43:22.277437       1 utils.go:212] ID: 4 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1106 07:43:22.278177       1 utils.go:195] ID: 5 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1106 07:43:22.278220       1 utils.go:206] ID: 5 GRPC request: {}
I1106 07:43:22.278292       1 utils.go:212] ID: 5 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1106 07:43:22.279347       1 utils.go:195] ID: 6 Req-ID: 0001-0011-openshift-storage-0000000000000001-7c13ed73-9cb3-4572-841d-d44070fa82d1 GRPC call: /csi.v1.Node/NodeStageVolume
I1106 07:43:22.279598       1 utils.go:206] ID: 6 Req-ID: 0001-0011-openshift-storage-0000000000000001-7c13ed73-9cb3-4572-841d-d44070fa82d1 GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/40e808aa3c3a608c051a109552d92adba56adf4638592416152e634758c62149/globalmount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"clusterID":"openshift-storage","fsName":"ocs-storagecluster-cephfilesystem","storage.kubernetes.io/csiProvisionerIdentity":"1667720230914-8081-openshift-storage.cephfs.csi.ceph.com","subvolumeName":"csi-vol-7c13ed73-9cb3-4572-841d-d44070fa82d1","subvolumePath":"/volumes/csi/csi-vol-7c13ed73-9cb3-4572-841d-d44070fa82d1/ad28caaa-b9f8-4cbc-81a5-bd9069ed439f"},"volume_id":"0001-0011-openshift-storage-0000000000000001-7c13ed73-9cb3-4572-841d-d44070fa82d1"}
I1106 07:43:22.316955       1 omap.go:88] ID: 6 Req-ID: 0001-0011-openshift-storage-0000000000000001-7c13ed73-9cb3-4572-841d-d44070fa82d1 got omap values: (pool="ocs-storagecluster-cephfilesystem-metadata", namespace="csi", name="csi.volume.7c13ed73-9cb3-4572-841d-d44070fa82d1"): map[csi.imagename:csi-vol-7c13ed73-9cb3-4572-841d-d44070fa82d1 csi.volname:pvc-060da5f3-97c8-485d-b198-3b648c888350]
I1106 07:43:22.324822       1 volumemounter.go:126] requested mounter: , chosen mounter: kernel


...

I1106 07:44:57.782401       1 utils.go:212] ID: 11 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1106 07:44:57.783260       1 utils.go:195] ID: 12 GRPC call: /csi.v1.Node/NodeGetVolumeStats
I1106 07:44:57.783374       1 utils.go:206] ID: 12 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-7c13ed73-9cb3-4572-841d-d44070fa82d1","volume_path":"/var/lib/kubelet/pods/b1b59db9-af54-4999-8a4e-abaa6f333c01/volumes/kubernetes.io~csi/pvc-060da5f3-97c8-485d-b198-3b648c888350/mount"}
I1106 07:44:57.784616       1 utils.go:212] ID: 12 GRPC response: {"usage":[{"available":107374182400,"total":107374182400,"unit":1}]}
I1106 07:46:45.640554       1 utils.go:195] ID: 13 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1106 07:46:45.640609       1 utils.go:206] ID: 13 GRPC request: {}
I1106 07:46:45.640682       1 utils.go:212] ID: 13 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1106 07:46:45.641549       1 utils.go:195] ID: 14 GRPC call: /csi.v1.Node/NodeGetVolumeStats
I1106 07:46:45.641606       1 utils.go:206] ID: 14 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-7c13ed73-9cb3-4572-841d-d44070fa82d1","volume_path":"/var/lib/kubelet/pods/b1b59db9-af54-4999-8a4e-abaa6f333c01/volumes/kubernetes.io~csi/pvc-060da5f3-97c8-485d-b198-3b648c888350/mount"}
I1106 07:46:45.643093       1 utils.go:212] ID: 14 GRPC response: {"usage":[{"available":107374182400,"total":107374182400,"unit":1}]}
I1106 07:47:58.306781       1 utils.go:195] ID: 15 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1106 07:47:58.306841       1 utils.go:206] ID: 15 GRPC request: {}
I1106 07:47:58.306935       1 utils.go:212] ID: 15 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1106 07:47:58.307910       1 utils.go:195] ID: 16 GRPC call: /csi.v1.Node/NodeGetVolumeStats
I1106 07:47:58.307970       1 utils.go:206] ID: 16 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-7c13ed73-9cb3-4572-841d-d44070fa82d1","volume_path":"/var/lib/kubelet/pods/b1b59db9-af54-4999-8a4e-abaa6f333c01/volumes/kubernetes.io~csi/pvc-060da5f3-97c8-485d-b198-3b648c888350/mount"}


Moving to VERIFIED

