Bug 1565561 - [CephFs] Input/output error on provisioned CephFS volumes
Summary: [CephFs] Input/output error on provisioned CephFS volumes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assignee: hchen
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-10 10:08 UTC by Jianwei Hou
Modified: 2018-07-30 19:13 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-30 19:12:35 UTC
Target Upstream Version:
Embargoed:


Links
Red Hat Product Errata RHBA-2018:1816 (last updated 2018-07-30 19:13:02 UTC)

Description Jianwei Hou 2018-04-10 10:08:39 UTC
Description of problem:
Unable to write to the container mount path of a provisioned CephFS volume; writes fail with "Input/output error".


Version-Release number of selected component (if applicable):
openshift v3.10.0-0.16.0
kubernetes v1.9.1+a0ce1bc657

How reproducible:
Always

Steps to Reproduce:
1. Create a CephFS StorageClass and PVC; a PV is provisioned.
2. Create a Pod that mounts the PVC (a sample Pod spec follows these steps).
3. oc rsh into the Pod and try to create a file in the mount path.
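
For reference, a minimal Pod of the kind created in step 2 might look like the following. This is an illustrative sketch, not taken from this report: the Pod name, image, and command are assumptions; only the claim name, namespace, and mount path match the dumps and shell output below.

```
apiVersion: v1
kind: Pod
metadata:
  name: cephfs-pod           # illustrative name, not from the report
  namespace: jhou
spec:
  containers:
  - name: cephfs
    image: busybox           # assumed image; any image with a shell will do
    command: ["sleep", "3600"]
    volumeMounts:
    - mountPath: /mnt/cephfs
      name: cephfs-vol
  volumes:
  - name: cephfs-vol
    persistentVolumeClaim:
      claimName: cephfsc     # PVC name from the PVC dump below
```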

Actual results:
/mnt/cephfs # touch ls
touch: ls: Input/output error

/mnt/cephfs # dmesg |tail
[1056542.414236] libceph: mon0 xxx:6789 session established
[1056542.415566] libceph: client14176 fsid 310b6b01-99fe-43fb-88f5-efaa9317515e
[1056542.599772] SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)

Expected results:
Writes to the mount path succeed; no "Input/output error".

Master Log:

Node Log (of failed PODs):

PV Dump:
---
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    cephFSProvisionerIdentity: ceph.com/cephfs
    cephShare: kubernetes-dynamic-pvc-bc6d0ed5-3c94-11e8-a12a-0a580a800010
    pv.kubernetes.io/provisioned-by: ceph.com/cephfs
  creationTimestamp: null
  name: pvc-bc5a83e5-3c94-11e8-a6af-0050569f41c6
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1Gi
  cephfs:
    monitors:
    - xxx:6789
    path: /volumes/kubernetes/kubernetes-dynamic-pvc-bc6d0ed5-3c94-11e8-a12a-0a580a800010
    secretRef:
      name: ceph-kubernetes-dynamic-user-bc6d0f82-3c94-11e8-a12a-0a580a800010-secret
      namespace: jhou
    user: kubernetes-dynamic-user-bc6d0f82-3c94-11e8-a12a-0a580a800010
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: cephfsc
    namespace: jhou
    resourceVersion: "1474926"
    uid: bc5a83e5-3c94-11e8-a6af-0050569f41c6
  persistentVolumeReclaimPolicy: Delete
  storageClassName: cephfs
status: {}


PVC Dump:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"18d7c220-3c8d-11e8-a12a-0a580a800010","leaseDurationSeconds":15,"acquireTime":"2018-04-10T07:56:54Z","renewTime":"2018-04-10T07:56:56Z","leaderTransitions":0}'
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-class: cephfs
    volume.beta.kubernetes.io/storage-provisioner: ceph.com/cephfs
  creationTimestamp: null
  finalizers:
  - kubernetes.io/pvc-protection
  name: cephfsc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  volumeName: pvc-bc5a83e5-3c94-11e8-a6af-0050569f41c6
status: {}


StorageClass Dump (if StorageClass used by PV/PVC):
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: null
  name: cephfs
parameters:
  adminId: admin
  adminSecretName: cephrbd-secret
  adminSecretNamespace: default
  monitors: xxx:6789
provisioner: ceph.com/cephfs
reclaimPolicy: Delete
volumeBindingMode: Immediate
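
The StorageClass above references an admin secret (cephrbd-secret in the default namespace). A rough sketch of how such a secret could be created from the Ceph admin key follows, assuming the provisioner reads the key from a data field named "key"; the exact field name and secret type are not confirmed by this report.

```
# On a Ceph monitor node: extract the admin key
ceph auth get-key client.admin > /tmp/admin.key

# On the OpenShift master: create the secret referenced by adminSecretName
# ("key" as the data field name is an assumption)
oc create secret generic cephrbd-secret -n default \
  --from-literal=key="$(cat /tmp/admin.key)"
```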


Additional info:
# ceph auth ls
...
client.kubernetes-dynamic-user-bc6d0f82-3c94-11e8-a12a-0a580a800010
        key: AQBHbsxaitnwBBAA/ST6a9SU2bm8MTr9HNvTuw==
        caps: [mds] allow r,allow rw path=/volumes/kubernetes/kubernetes-dynamic-pvc-bc6d0ed5-3c94-11e8-a12a-0a580a800010
        caps: [mon] allow r
        caps: [osd] allow rw pool=cephfs_data namespace=fsvolumens_kubernetes-dynamic-pvc-bc6d0ed5-3c94-11e8-a12a-0a580a800010
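
The caps of a single dynamically created user can also be printed directly, which makes it easy to check whether the osd cap carries the namespace restriction shown above. A minimal sketch, using the client name from this report:

```
ceph auth get client.kubernetes-dynamic-user-bc6d0f82-3c94-11e8-a12a-0a580a800010
```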

Comment 4 hchen 2018-04-11 15:28:51 UTC
kernel and rhel info:

[root@ocp310 ~]# uname -a
Linux ocp310.node2.vsphere.local 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 13 10:46:25 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@ocp310 ~]# more /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.4 (Maipo)

Comment 5 hchen 2018-04-11 15:45:56 UTC
[root@ceph-mon ~]# ceph -v
ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)

Comment 9 Jianwei Hou 2018-04-16 07:05:50 UTC
Thank you @hchen.

With image "cephfs-provisioner:v0.0.2-1" and the argument "-disable-ceph-namespace=true", this is fixed. The provisioner DeploymentConfig used:

```
apiVersion: v1
kind: DeploymentConfig
metadata:
  annotations:
    description: Defines how to deploy the cephfs provisioner pod.
  creationTimestamp: null
  generation: 1
  labels:
    cephfs: cephfs-dc
  name: cephfs-provisioner-dc
spec:
  replicas: 1
  selector:
    cephfs: cephfs-provisioner
  strategy:
    activeDeadlineSeconds: 21600
    recreateParams:
      timeoutSeconds: 600
    resources: {}
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        cephfs: cephfs-provisioner
      name: cephfs-provisioner
    spec:
      containers:
      - args:
        - -id=cephfs-provisioner-1
        - -disable-ceph-namespace=true
        env:
        - name: PROVISIONER_NAME
          value: ceph.com/cephfs
        image: openshift3/cephfs-provisioner:v0.0.2-1
        imagePullPolicy: IfNotPresent
        name: cephfs-provisioner
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: cephfs-provisioner
      serviceAccountName: cephfs-provisioner
      terminationGracePeriodSeconds: 30
  test: false
  triggers:
  - type: ConfigChange
status:
  availableReplicas: 0
  latestVersion: 0
  observedGeneration: 0
  replicas: 0
  unavailableReplicas: 0
  updatedReplicas: 0
```
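
For what it's worth, re-verification after rolling out the provisioner with the flag above can follow the original reproduction steps. A rough sketch (manifest file names and the pod name are placeholders, not from this report):

```
# Roll out the updated provisioner, then recreate the PVC and Pod
oc rollout latest dc/cephfs-provisioner-dc
oc delete pvc cephfsc -n jhou
oc create -f pvc.yaml -n jhou    # pvc.yaml / pod.yaml are assumed local manifests
oc create -f pod.yaml -n jhou

# Writes to the mount path should now succeed instead of returning
# "Input/output error"
oc rsh <pod-name> touch /mnt/cephfs/testfile
```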

Comment 12 Jianwei Hou 2018-05-17 05:17:55 UTC
I have verified that cephfs-provisioner:v0.0.2-2 fixes the problem.

Comment 14 errata-xmlrpc 2018-07-30 19:12:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816

