Description of problem:
According to the documentation here: https://docs.openshift.com/enterprise/3.1/install_config/persistent_storage/persistent_storage_aws.html#volume-format-aws AWS volumes can be left unformatted; before a volume is used, the filesystem is checked, and if no filesystem exists the PV is formatted before mounting. This eliminates quite a bit of the up-front work needed to prepare EBS PVs.

We have found that specifying the PV filesystem type as "xfs" results in the PV being defined, but the pod never becomes ready. We have noticed that the volume is never formatted. Here is a piece of the PV spec:

---------------------------------
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: xfs
    volumeID: vol-486631ea
  capacity:
    storage: 1Gi
---------------------------------

I have tried to debug this, but it is not clear what is happening with the mount (we raised the debug level, but the logs did not show the output of the mount attempt or the conditional format).

Version-Release number of selected component (if applicable):
atomic-openshift-3.1.1.6-5

How reproducible:
Always

Steps to Reproduce:
1. Create an EBS volume.
2. Use that EBS volume to create a PV in OpenShift. Here is an example spec we use:

---------------------------------
apiVersion: v1
kind: PersistentVolume
metadata:
  name: <pv name>
  labels:
    type: ebs
spec:
  capacity:
    storage: <size in GB>Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  awsElasticBlockStore:
    volumeID: <EBS Volume ID>
    fsType: xfs
---------------------------------

3. Create a persistent pod using this volume.

Note: we are not formatting this drive at all before it is used; we expect OpenShift to do this.

Actual results:
The pod stays in a bad state and never starts. Upon investigation it is found that the pod cannot mount the volume.

Expected results:
The volume would be formatted before mounting if it is not already formatted, as happens with ext4.
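For step 3, a minimal claim-and-pod pair along these lines can be used to exercise the PV; the resource names, image, and mount path below are illustrative placeholders, not taken from the original report:

```yaml
# Hypothetical claim sized to bind to the 1Gi ReadWriteOnce PV above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Pod mounting the claim; with fsType: xfs on an unformatted volume,
# this pod never becomes Ready on the affected 3.1.1 builds.
apiVersion: v1
kind: Pod
metadata:
  name: ebs-pod
spec:
  containers:
  - name: web
    image: aosqe/hello-openshift
    volumeMounts:
    - mountPath: /usr/share/nginx/html
      name: html-volume
  volumes:
  - name: html-volume
    persistentVolumeClaim:
      claimName: ebs-claim
```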
Is mkfs.xfs installed on the node?
Yes. The xfs utils are installed.
Reproduced with atomic-openshift-node-3.1.1.6-5.git.35.0742c54.el7aos.x86_64.

openshift-node *does* call mkfs.xfs, but with the wrong (ext4-specific) options. From strace:

---------------------------------
execve("/usr/sbin/mkfs.xfs", ["mkfs.xfs", "-E", "lazy_itable_init=0,lazy_journal_init=0", "-F", "/dev/xvdf"]

mkfs.xfs: invalid option -- 'E'
---------------------------------

This has been fixed upstream:
https://github.com/kubernetes/kubernetes/commit/a31d23ea0e1839d14a4ac112d962735480276793
and is already fixed in 3.2.
Matt - Since Jan said this is resolved in 3.2 already, could we have either you or Stefanie verify that you're not seeing the same issue for online? Also, do you want this backported for 3.1? Thanks!
Not backporting to 3.1. Moving to on_qa.
Verified on openshift v3.2.0.16, kubernetes v1.2.0-36-g4a3f9c5, etcd 2.2.5. The pod runs when using an xfs or ext3 filesystem.

---------------------------------
[root@ip-172-18-11-13 ~]# oc get pods awspdxfs -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    openshift.io/scc: privileged
  creationTimestamp: 2016-04-25T03:26:12Z
  name: awspdxfs
  namespace: default
  resourceVersion: "40455"
  selfLink: /api/v1/namespaces/default/pods/awspdxfs
  uid: 75a74583-0a95-11e6-a33b-0ef0cbb2352b
spec:
  containers:
  - image: aosqe/hello-openshift
    imagePullPolicy: Always
    name: web
    ports:
    - containerPort: 80
      name: web
      protocol: TCP
    resources: {}
    securityContext:
      privileged: false
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /usr/share/nginx/html
      name: html-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-ru8pg
      readOnly: true
  dnsPolicy: ClusterFirst
  host: ip-172-18-11-12.ec2.internal
  imagePullSecrets:
  - name: default-dockercfg-0kzxa
  nodeName: ip-172-18-11-12.ec2.internal
  restartPolicy: Always
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  volumes:
  - awsElasticBlockStore:
      fsType: xfs
      volumeID: aws://us-east-1d/vol-c47f5466
    name: html-volume
  - name: default-token-ru8pg
    secret:
      secretName: default-token-ru8pg
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2016-04-25T03:26:19Z
    status: "True"
    type: Ready
  containerStatuses:
  - containerID: docker://e9d9e2bce12d1b37eab9da434d4c6d721255232d41250a0e7c866809abed5b09
    image: aosqe/hello-openshift
    imageID: docker://cddcd4ab363acd31256ed7880d4b669fa45227e49eec41429f80a4f252dfb0da
    lastState: {}
    name: web
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2016-04-25T03:26:19Z
  hostIP: 172.18.11.12
  phase: Running
  podIP: 10.2.1.2
  startTime: 2016-04-25T03:26:12Z
---------------------------------
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2016:1064