Description of problem:
An OpenShift node suffered an OOM event that killed the node process. This resulted in persistent volumes that were in use at that time by pods running on that node being wiped.

Version-Release number of selected component (if applicable):
atomic-openshift-3.3.1.5-1.git.0.62700af.el7

How reproducible:
Observed at least twice on two separate AWS-based clusters.

Steps to Reproduce:

1. Deploy atomic-openshift-3.3.1.5-1.git.0.62700af.el7.x86_64 on some AWS nodes (two were used for reproduction).

2. Create a PVC + PV (using e.g. alpha provisioning):

$ oc create -f - <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
  annotations:
    "volume.alpha.kubernetes.io/storage-class": "fast"
    "volume.beta.kubernetes.io/storage-class": "fast"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1500Mi
EOF

3. Create a pod that writes to the volume:

$ oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: testpod
  labels:
    name: test
spec:
  restartPolicy: Never
  containers:
  - resources:
      limits:
        cpu: 0.5
    image: gcr.io/google_containers/busybox
    command:
    - "/bin/sh"
    - "-c"
    - "while true; do date; date >>/mnt/test/date; sleep 1; done"
    name: busybox
    volumeMounts:
    - name: vol
      mountPath: /mnt/test
  volumes:
  - name: vol
    persistentVolumeClaim:
      claimName: myclaim
EOF

Now the magic:

4. Find out on which node the pod runs, e.g.:

$ oc describe pod testpod
...
"Successfully assigned testpod to ip-172-18-8-57.ec2.internal"

5. On the node, look up where the volume is mounted and check that the file "date" is there:

$ mount
...
/dev/xvdba on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-1c6cc08d type ext4
...
$ ls /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-1c6cc08d
date

6. On the node, stop the node service, then on the master delete the pod:

(node)   $ service atomic-openshift-node stop
(master) $ oc delete pod testpod

7. On the node, start the node service again:

$ service atomic-openshift-node start

8. Wait for a minute or so...

9. On the node, check that the volume is still mounted *and* the directory where it is mounted is empty:

$ mount
...
/dev/xvdba on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-1c6cc08d type ext4
...
$ ls /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-1c6cc08d

Actual results:
-> Something removed the file "date" from the volume!

Expected results:
-> The data is still there.
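For convenience, the checks in steps 5 and 9 can be condensed into one snippet to run on the node before and after the node-service restart; a minimal sketch, assuming the volume ID and mount path from the output above:

```
# run on the node before and after restarting atomic-openshift-node
VOL=vol-1c6cc08d   # EBS volume ID taken from the mount output; substitute your own
MNT=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/$VOL
mount | grep "$VOL"   # the volume should show up as still mounted
ls -l "$MNT"          # before: the "date" file is present; afterwards the directory is empty
```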
We need to backport https://github.com/kubernetes/kubernetes/pull/27970 and https://github.com/kubernetes/kubernetes/pull/36840 into 3.3
I am wondering if there is a chance the data file still lived on the volume but the volume was detached after step 9. Can you get the mount output? If the EBS volume is detached, can you re-attach it and check whether the file is still there?
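For reference, a rough sketch of that check with the AWS CLI, using the volume ID from the description (the instance ID, device name and mount point are placeholders):

```
# check whether the volume is still attached; "available" means it has been detached
$ aws ec2 describe-volumes --volume-ids vol-1c6cc08d \
    --query 'Volumes[0].{State:State,Attachments:Attachments}'

# if detached, re-attach it to an instance and inspect the data
$ aws ec2 attach-volume --volume-id vol-1c6cc08d \
    --instance-id i-0123456789abcdef0 --device /dev/xvdf
$ mkdir -p /mnt/inspect && mount /dev/xvdf /mnt/inspect
$ ls -l /mnt/inspect
```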
I tried to reproduce this on GCE following the above steps. After step 8, the pod was deleted; when you run "mount", the directories were still there but no longer accessible:

```
-bash-4.2# mount|grep pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013
/dev/sdc on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013 type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sdc on /var/lib/origin/openshift.local.volumes/pods/6c21aa22-b5f9-11e6-8bac-42010af00013/volumes/kubernetes.io~gce-pd/pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013 type ext4 (rw,relatime,seclabel,data=ordered)
-bash-4.2# ls /var/lib/origin/openshift.local.volumes/pods/6c21aa22-b5f9-11e6-8bac-42010af00013/volumes/kubernetes.io~gce-pd/pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013/
ls: reading directory /var/lib/origin/openshift.local.volumes/pods/6c21aa22-b5f9-11e6-8bac-42010af00013/volumes/kubernetes.io~gce-pd/pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013/: Input/output error
```

From the GCE console I looked up the volume and found it was 'available'. Since the PV and PVC were still there, I recreated the pod to use the volume. The data in the volume was **NOT** wiped. So, following the steps in the description, after step 9 the system was left with a stale mount; the volume had already been detached from the node.
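For reference, the disk's attachment state can also be confirmed from the GCE side; a rough sketch, assuming the gcloud CLI is configured and using the dynamic disk name from the mount output (the zone is a placeholder):

```
# "users" lists the instances the disk is attached to; empty means it is detached/available
$ gcloud compute disks describe kubernetes-dynamic-pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013 \
    --zone us-central1-a --format='value(status,users)'
```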
Reproduced it on AWS with openshift v3.3.1.5. Yes, the volume was still attached and mounted, and the data on it was lost.
Just wanted to reiterate what hchen posted. I started a cluster with OpenShift (v3.3.1.5) on AWS using Flexy launch and created a PVC with the following yaml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: dyn-claim
  annotations:
    volume.alpha.kubernetes.io/storage-class: "bar"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

and the pod looked like:

apiVersion: v1
kind: Pod
metadata:
  name: testpod
  labels:
    name: test
spec:
  restartPolicy: Never
  containers:
  - resources:
      limits:
        cpu: 0.5
    image: gcr.io/google_containers/busybox
    command:
    - "/bin/sh"
    - "-c"
    - "while true; do date; date >>/mnt/test/date; sleep 1; done"
    name: busybox
    volumeMounts:
    - name: vol
      mountPath: /mnt/test
  volumes:
  - name: vol
    persistentVolumeClaim:
      claimName: dyn-claim

I first created the PV/PVC and started the pod. After that I stopped the atomic-openshift-node service on the node where the pod was scheduled and then deleted the pod (the deletion was stuck in the Terminating state). After waiting a while, I started the atomic-openshift-node service again and the volume mount was gone from the node. I checked the volume by mounting it in another pod, and the data was still there.
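For completeness, that last check can be done with a throwaway pod that mounts the same claim; a minimal sketch in the style of the reproducer above (the pod name and the exact commands are assumptions):

```
$ oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: verify-data        # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: busybox
    image: gcr.io/google_containers/busybox
    # print the file written by the original pod, then keep the pod around briefly
    command: ["/bin/sh", "-c", "ls -l /mnt/test && tail -n 5 /mnt/test/date; sleep 600"]
    volumeMounts:
    - name: vol
      mountPath: /mnt/test
  volumes:
  - name: vol
    persistentVolumeClaim:
      claimName: dyn-claim
EOF
$ oc logs verify-data
```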
The key to reproducing under 3.3.1.5 is to make sure the node comes up with

```
enable-controller-attach-detach:
- 'false'
```

This would be the case for upgraded clusters, but would not be the default for new clusters...
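For context, this kubelet argument is set under kubeletArguments in the node configuration; a quick way to check it on a node (assuming the default config path /etc/origin/node/node-config.yaml):

```
# 'false' here means the node itself, not the controller, handles attach/detach
$ grep -A1 enable-controller-attach-detach /etc/origin/node/node-config.yaml
  enable-controller-attach-detach:
  - 'false'
```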
Also worth pointing out (not surprising given the attach controller thing): this does not reproduce if you use a hostDir (hostPath) PV.
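For reference, a hostPath-backed PV of that kind could be created like this; a minimal sketch, where the PV name, path and size are assumptions:

```
$ oc create -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hostdir-pv            # hypothetical name
spec:
  capacity:
    storage: 1500Mi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /tmp/hostdir-pv     # hypothetical directory on the node
EOF
```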
(In reply to Eric Paris from comment #13)
> The key to reproducing under 3.3.1.5 is to make sure the node comes up with
>
> enable-controller-attach-detach:
> - 'false'

I can reproduce it with enable-controller-attach-detach: true; the key is probably to wait long enough before starting the node again, i.e. between steps 6 and 7. The pod should disappear from the API server; "Terminating" is not enough. "oc delete pod testpod --grace-period=0" speeds things up.
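A sketch of that timing-sensitive window, reusing the pod name from the reproducer (the polling loop is just one way to confirm the pod is really gone from the API server):

```
# on the node: stop the node service so it cannot clean up the mount yet
$ service atomic-openshift-node stop

# on the master: force-delete the pod and wait until it is gone from the API server
$ oc delete pod testpod --grace-period=0
$ while oc get pod testpod >/dev/null 2>&1; do sleep 5; done

# on the node: start the node service again, then check the mount point as in step 9
$ service atomic-openshift-node start
```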
Fix merged into OSE 3.3: https://github.com/openshift/ose/pull/480

The 3.3 branch of Origin has test failures, so we haven't merged there yet (but there is a PR here: https://github.com/openshift/origin/pull/12024).

Additionally, we've validated that the bug doesn't exist in:

OSE 3.1.x
OSE 3.2.x
OSE 3.3.x (fixed)
OSE 3.4

QE, please let me know when you've created a test case for this so we can validate.
Verified on:

openshift v3.3.1.7
kubernetes v1.3.0+52492b4
etcd 2.3.0+git

Confirmed this data-deletion issue is fixed. Also tested that it does not exist in 3.1, 3.2 and 3.4.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2016:2915
*** Bug 1427536 has been marked as a duplicate of this bug. ***