Bug 2133388
| Summary: | [ACM 2.7 Tracker] [RDR][CEPHFS] volsync-rsync-src pods are stuck in CreateContainerError on primary site | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Pratik Surve <prsurve> |
| Component: | odf-dr | Assignee: | Benamar Mekhissi <bmekhiss> |
| odf-dr sub component: | ramen | QA Contact: | Pratik Surve <prsurve> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | ||
| Priority: | unspecified | CC: | bmekhiss, ebenahar, madam, mrajanna, muagarwa, ocs-bugs, odf-bz-bot |
| Version: | 4.12 | Keywords: | TestBlocker |
| Target Milestone: | --- | ||
| Target Release: | ODF 4.12.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-01-31 00:19:51 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Pratik Surve
2022-10-10 07:59:21 UTC
oc get po volsync-rsync-src-dd-io-pvc-1-jsmhm -nbusybox-workloads-8 -oyaml
apiVersion: v1
kind: Pod
metadata:
annotations:
k8s.v1.cni.cncf.io/network-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.135.1.62"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.135.1.62"
],
"default": true,
"dns": {}
}]
openshift.io/scc: privileged
creationTimestamp: "2022-10-10T20:38:59Z"
finalizers:
- batch.kubernetes.io/job-tracking
generateName: volsync-rsync-src-dd-io-pvc-1-
labels:
app.kubernetes.io/component: rsync-mover
app.kubernetes.io/created-by: volsync
app.kubernetes.io/name: src-dd-io-pvc-1
app.kubernetes.io/part-of: volsync
controller-uid: bf43332a-2284-46f2-918a-cc2beef1c033
job-name: volsync-rsync-src-dd-io-pvc-1
name: volsync-rsync-src-dd-io-pvc-1-jsmhm
namespace: busybox-workloads-8
ownerReferences:
- apiVersion: batch/v1
blockOwnerDeletion: true
controller: true
kind: Job
name: volsync-rsync-src-dd-io-pvc-1
uid: bf43332a-2284-46f2-918a-cc2beef1c033
resourceVersion: "4314657"
uid: ffa3d2f3-5528-40f4-bf96-005612ae925c
spec:
containers:
- command:
- /bin/bash
- -c
- /source.sh
env:
- name: DESTINATION_ADDRESS
value: volsync-rsync-dst-dd-io-pvc-1.busybox-workloads-8.svc.clusterset.local
image: registry.redhat.io/rhacm2/volsync-mover-rsync-rhel8@sha256:35e3cbedcc3c558f484f743ead5ba14a2a1ebe55751f6ff4975a3d8b00e61b54
imagePullPolicy: IfNotPresent
name: rsync
resources: {}
securityContext:
capabilities:
add:
- AUDIT_WRITE
- SYS_CHROOT
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data
name: data
- mountPath: /keys
name: keys
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-2ph6k
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
imagePullSecrets:
- name: volsync-src-dd-io-pvc-1-dockercfg-q2hcs
nodeName: compute-2
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Never
schedulerName: default-scheduler
securityContext: {}
serviceAccount: volsync-src-dd-io-pvc-1
serviceAccountName: volsync-src-dd-io-pvc-1
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: data
persistentVolumeClaim:
claimName: volsync-dd-io-pvc-1-src
- name: keys
secret:
defaultMode: 384
secretName: busybox-drpc-vs-secret
- name: kube-api-access-2ph6k
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
- configMap:
items:
- key: service-ca.crt
path: service-ca.crt
name: openshift-service-ca.crt
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2022-10-10T20:39:00Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2022-10-10T20:39:00Z"
message: 'containers with unready status: [rsync]'
reason: ContainersNotReady
status: "False"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2022-10-10T20:39:00Z"
message: 'containers with unready status: [rsync]'
reason: ContainersNotReady
status: "False"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2022-10-10T20:39:00Z"
status: "True"
type: PodScheduled
containerStatuses:
- image: registry.redhat.io/rhacm2/volsync-mover-rsync-rhel8@sha256:35e3cbedcc3c558f484f743ead5ba14a2a1ebe55751f6ff4975a3d8b00e61b54
imageID: ""
lastState: {}
name: rsync
ready: false
restartCount: 0
started: false
state:
waiting:
message: 'relabel failed /var/lib/kubelet/pods/ffa3d2f3-5528-40f4-bf96-005612ae925c/volumes/kubernetes.io~csi/pvc-b967a92c-370b-4078-968c-ca58a188ea11/mount:
lsetxattr /var/lib/kubelet/pods/ffa3d2f3-5528-40f4-bf96-005612ae925c/volumes/kubernetes.io~csi/pvc-b967a92c-370b-4078-968c-ca58a188ea11/mount/10-10-2022_07-13-12-dd-io-1-5857bfdcd9-qvl9m:
read-only file system'
reason: CreateContainerError
hostIP: 10.70.53.37
phase: Pending
podIP: 10.135.1.62
podIPs:
- ip: 10.135.1.62
qosClass: BestEffort
startTime: "2022-10-10T20:39:00Z"
[🎩︎]mrajanna@fedora ocs-operator $]oc get pvc volsync-dd-io-pvc-1-src -nbusybox-workloads-8
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
volsync-dd-io-pvc-1-src Bound pvc-b967a92c-370b-4078-968c-ca58a188ea11 117Gi ROX ocs-storagecluster-cephfs-vrg 24h
[🎩︎]mrajanna@fedora ocs-operator $]oc get pv/pvc-b967a92c-370b-4078-968c-ca58a188ea11 -oyaml
apiVersion: v1
kind: PersistentVolume
metadata:
annotations:
pv.kubernetes.io/provisioned-by: openshift-storage.cephfs.csi.ceph.com
volume.kubernetes.io/provisioner-deletion-secret-name: rook-csi-cephfs-provisioner
volume.kubernetes.io/provisioner-deletion-secret-namespace: openshift-storage
creationTimestamp: "2022-10-10T07:14:59Z"
finalizers:
- kubernetes.io/pv-protection
name: pvc-b967a92c-370b-4078-968c-ca58a188ea11
resourceVersion: "2044710"
uid: 08cf1639-2ff9-4522-84e4-260d5d5273eb
spec:
accessModes:
- ReadOnlyMany
capacity:
storage: 117Gi
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: volsync-dd-io-pvc-1-src
namespace: busybox-workloads-8
resourceVersion: "2044683"
uid: b967a92c-370b-4078-968c-ca58a188ea11
csi:
controllerExpandSecretRef:
name: rook-csi-cephfs-provisioner
namespace: openshift-storage
driver: openshift-storage.cephfs.csi.ceph.com
nodeStageSecretRef:
name: rook-csi-cephfs-node
namespace: openshift-storage
volumeAttributes:
backingSnapshot: "true"
clusterID: openshift-storage
fsName: ocs-storagecluster-cephfilesystem
storage.kubernetes.io/csiProvisionerIdentity: 1665376730267-8081-openshift-storage.cephfs.csi.ceph.com
subvolumeName: csi-vol-c4dbe41a-5070-4a89-b434-6a3b4258bb91
subvolumePath: /volumes/csi/csi-vol-dc03ff05-2745-4fc5-a507-a2a40b4cfbd0/
volumeHandle: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91
persistentVolumeReclaimPolicy: Delete
storageClassName: ocs-storagecluster-cephfs-vrg
volumeMode: Filesystem
status:
phase: Bound
[🎩︎]mrajanna@fedora ocs-operator $]oc get pvc volsync-dd-io-pvc-1-src -nbusybox-workloads-8 -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
pv.kubernetes.io/bind-completed: "yes"
pv.kubernetes.io/bound-by-controller: "yes"
volume.beta.kubernetes.io/storage-provisioner: openshift-storage.cephfs.csi.ceph.com
volume.kubernetes.io/storage-provisioner: openshift-storage.cephfs.csi.ceph.com
creationTimestamp: "2022-10-10T07:14:59Z"
finalizers:
- kubernetes.io/pvc-protection
labels:
app.kubernetes.io/created-by: volsync
volsync.backube/cleanup: 086601ef-da88-4556-b339-84125f1ab0c4
name: volsync-dd-io-pvc-1-src
namespace: busybox-workloads-8
ownerReferences:
- apiVersion: volsync.backube/v1alpha1
blockOwnerDeletion: true
controller: true
kind: ReplicationSource
name: dd-io-pvc-1
uid: 086601ef-da88-4556-b339-84125f1ab0c4
resourceVersion: "2044716"
uid: b967a92c-370b-4078-968c-ca58a188ea11
spec:
accessModes:
- ReadOnlyMany
dataSource:
apiGroup: snapshot.storage.k8s.io
kind: VolumeSnapshot
name: volsync-dd-io-pvc-1-src
dataSourceRef:
apiGroup: snapshot.storage.k8s.io
kind: VolumeSnapshot
name: volsync-dd-io-pvc-1-src
resources:
requests:
storage: 117Gi
storageClassName: ocs-storagecluster-cephfs-vrg
volumeMode: Filesystem
volumeName: pvc-b967a92c-370b-4078-968c-ca58a188ea11
status:
accessModes:
- ReadOnlyMany
capacity:
storage: 117Gi
phase: Bound
cat log | grep 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91
I1010 20:39:05.970547 1 utils.go:195] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 GRPC call: /csi.v1.Node/NodeStageVolume
I1010 20:39:05.970809 1 utils.go:206] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/6563ebbf4428a39eb6748cf46d617b9c16e607b747e6231b8cf3951d144253a1/globalmount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":3}},"volume_context":{"backingSnapshot":"true","clusterID":"openshift-storage","fsName":"ocs-storagecluster-cephfilesystem","storage.kubernetes.io/csiProvisionerIdentity":"1665376730267-8081-openshift-storage.cephfs.csi.ceph.com","subvolumeName":"csi-vol-c4dbe41a-5070-4a89-b434-6a3b4258bb91","subvolumePath":"/volumes/csi/csi-vol-dc03ff05-2745-4fc5-a507-a2a40b4cfbd0/"},"volume_id":"0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91"}
I1010 20:39:05.979361 1 omap.go:88] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 got omap values: (pool="ocs-storagecluster-cephfilesystem-metadata", namespace="csi", name="csi.volume.c4dbe41a-5070-4a89-b434-6a3b4258bb91"): map[csi.imagename:csi-vol-c4dbe41a-5070-4a89-b434-6a3b4258bb91 csi.volname:pvc-b967a92c-370b-4078-968c-ca58a188ea11 csi.volume.backingsnapshotid:0001-0011-openshift-storage-0000000000000001-09476a54-7e65-4a89-88c1-a3ae09a23fff]
I1010 20:39:06.047222 1 omap.go:88] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 got omap values: (pool="ocs-storagecluster-cephfilesystem-metadata", namespace="csi", name="csi.snap.09476a54-7e65-4a89-88c1-a3ae09a23fff"): map[csi.imagename:csi-snap-09476a54-7e65-4a89-88c1-a3ae09a23fff csi.snapname:snapshot-6b965c7e-9e14-48ec-88d3-6b779538ff7a csi.source:csi-vol-dc03ff05-2745-4fc5-a507-a2a40b4cfbd0]
I1010 20:39:06.340958 1 nodeserver.go:247] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 cephfs: mounting volume 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 with Ceph kernel client
I1010 20:39:06.371611 1 cephcmds.go:105] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 command succeeded: modprobe [ceph]
I1010 20:39:06.808022 1 cephcmds.go:105] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 command succeeded: mount [-t ceph 172.31.227.77:6789,172.31.110.208:6789,172.31.109.117:6789:/volumes/csi/csi-vol-dc03ff05-2745-4fc5-a507-a2a40b4cfbd0/ /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/6563ebbf4428a39eb6748cf46d617b9c16e607b747e6231b8cf3951d144253a1/globalmount -o name=csi-cephfs-node,secretfile=/tmp/csi/keys/keyfile-3523890341,mds_namespace=ocs-storagecluster-cephfilesystem,ro,_netdev]
I1010 20:39:06.821912 1 cephcmds.go:105] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 command succeeded: mount [-o bind,_netdev /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/6563ebbf4428a39eb6748cf46d617b9c16e607b747e6231b8cf3951d144253a1/globalmount/.snap/csi-snap-09476a54-7e65-4a89-88c1-a3ae09a23fff/fbbc4c34-1178-49a6-9d73-af074f7c18c0 /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/6563ebbf4428a39eb6748cf46d617b9c16e607b747e6231b8cf3951d144253a1/globalmount]
I1010 20:39:06.828267 1 cephcmds.go:105] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 command succeeded: mount [-o bind,_netdev,remount /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/6563ebbf4428a39eb6748cf46d617b9c16e607b747e6231b8cf3951d144253a1/globalmount]
I1010 20:39:06.828321 1 nodeserver.go:206] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 cephfs: successfully mounted volume 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 to /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/6563ebbf4428a39eb6748cf46d617b9c16e607b747e6231b8cf3951d144253a1/globalmount
I1010 20:39:06.828361 1 utils.go:212] ID: 4344 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 GRPC response: {}
I1010 20:39:06.846934 1 utils.go:195] ID: 4348 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 GRPC call: /csi.v1.Node/NodePublishVolume
I1010 20:39:06.847044 1 utils.go:206] ID: 4348 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/6563ebbf4428a39eb6748cf46d617b9c16e607b747e6231b8cf3951d144253a1/globalmount","target_path":"/var/lib/kubelet/pods/ffa3d2f3-5528-40f4-bf96-005612ae925c/volumes/kubernetes.io~csi/pvc-b967a92c-370b-4078-968c-ca58a188ea11/mount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":3}},"volume_context":{"backingSnapshot":"true","clusterID":"openshift-storage","fsName":"ocs-storagecluster-cephfilesystem","storage.kubernetes.io/csiProvisionerIdentity":"1665376730267-8081-openshift-storage.cephfs.csi.ceph.com","subvolumeName":"csi-vol-c4dbe41a-5070-4a89-b434-6a3b4258bb91","subvolumePath":"/volumes/csi/csi-vol-dc03ff05-2745-4fc5-a507-a2a40b4cfbd0/"},"volume_id":"0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91"}
I1010 20:39:06.852058 1 cephcmds.go:105] ID: 4348 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 command succeeded: mount [-o bind,_netdev /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/6563ebbf4428a39eb6748cf46d617b9c16e607b747e6231b8cf3951d144253a1/globalmount /var/lib/kubelet/pods/ffa3d2f3-5528-40f4-bf96-005612ae925c/volumes/kubernetes.io~csi/pvc-b967a92c-370b-4078-968c-ca58a188ea11/mount]
I1010 20:39:06.852106 1 nodeserver.go:467] ID: 4348 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 cephfs: successfully bind-mounted volume 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 to /var/lib/kubelet/pods/ffa3d2f3-5528-40f4-bf96-005612ae925c/volumes/kubernetes.io~csi/pvc-b967a92c-370b-4078-968c-ca58a188ea11/mount
I1010 20:39:06.852139 1 utils.go:212] ID: 4348 Req-ID: 0001-0011-openshift-storage-0000000000000001-c4dbe41a-5070-4a89-b434-6a3b4258bb91 GRPC response: {}
I can see nothing failed at the CSI level.
> Warning Failed 33m (x10 over 35m) kubelet Error: relabel failed /var/lib/kubelet/pods/afa15ec8-0395-4544-a90a-dbf7762495da/volumes/kubernetes.io~csi/pvc-a773e380-8b0d-4294-952e-7a59fd99f547/mount: lsetxattr /var/lib/kubelet/pods/afa15ec8-0395-4544-a90a-dbf7762495da/volumes/kubernetes.io~csi/pvc-a773e380-8b0d-4294-952e-7a59fd99f547/mount/10-10-2022_07-13-54-dd-io-3-5d6b4b84df-wj2df: read-only file system
It looks like this is coming from the kubelet: the SELinux relabel is failing because the CephFS snapshot/subvolume is mounted read-only.
The SELinux label should not need to change after the failover, should it?
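For context, the kubelet's recursive SELinux relabel ultimately comes down to setting the security.selinux extended attribute on each file in the volume, and on a read-only mount that call can only fail with EROFS. A minimal sketch (not taken from this bug report; the path and label below are illustrative assumptions) of the same failure class:

// Sketch: an SELinux relabel is an lsetxattr of security.selinux per file.
// On a volume mounted read-only, as this CephFS snapshot-backed PVC is,
// the syscall returns EROFS, which the kubelet reports as
// "relabel failed ... read-only file system" (CreateContainerError above).
package main

import (
	"errors"
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	// Hypothetical file on a read-only snapshot mount; in this bug the real
	// path lives under /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/...
	path := "/mnt/ro-snapshot/example-file"
	label := []byte("system_u:object_r:container_file_t:s0:c25,c15")

	err := unix.Lsetxattr(path, "security.selinux", label, 0)
	if errors.Is(err, unix.EROFS) {
		fmt.Println("lsetxattr failed: read-only file system (EROFS)")
	} else if err != nil {
		fmt.Println("lsetxattr failed:", err)
	}
}

If that reasoning holds, the failure is independent of the CSI driver: any read-only volume the kubelet attempts to relabel would hit the same error, which matches the observation that nothing failed at the CSI level.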
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Data Foundation 4.12.0 enhancement and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:0551