Bug 1398417 - Data from persistent volumes is wiped after a node service restart
Summary: Data from persistent volumes is wiped after a node service restart
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.3.1
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Jan Safranek
QA Contact: Jianwei Hou
URL:
Whiteboard:
Duplicates: 1427536
Depends On:
Blocks: 1267746
 
Reported: 2016-11-24 18:40 UTC by Josep 'Pep' Turro Mauri
Modified: 2020-05-14 15:25 UTC
CC List: 15 users

Fixed In Version: atomic-openshift-3.3.1.7-1.git.0.0988966.el7
Doc Type: Bug Fix
Doc Text:
Cause: The OpenShift node daemon did not recover properly from a restart and lost information about attached and mounted volumes. Consequence: In rare cases, the node daemon deleted all data on a mounted volume, thinking that the volume had already been unmounted when it was in fact only missing from the node's cache. Fix: The node caches are now recovered after a restart. Result: No data loss on mounted volumes.
Clone Of:
Environment:
Last Closed: 2016-12-07 20:59:38 UTC
Target Upstream Version:
Embargoed:


Links
Red Hat Product Errata RHSA-2016:2915 (SHIPPED_LIVE): Important: atomic-openshift security and bug fix update. Last updated 2016-12-08 01:58:10 UTC

Description Josep 'Pep' Turro Mauri 2016-11-24 18:40:40 UTC
Description of problem:

An OpenShift node suffered an OOM event that killed the node process. This resulted in the persistent volumes that were in use at that time by pods running on that node being wiped.

Version-Release number of selected component (if applicable):

atomic-openshift-3.3.1.5-1.git.0.62700af.el7

How reproducible:
Observed at least twice on two separate AWS-based clusters.

Steps to Reproduce:

1. deploy atomic-openshift-3.3.1.5-1.git.0.62700af.el7.x86_64 on some AWS nodes (two used for reproduction)

2. create a PVC + PV (using e.g. alpha provisioning):
$ oc create -f - <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
  annotations:
    "volume.alpha.kubernetes.io/storage-class": "fast"
    "volume.beta.kubernetes.io/storage-class": "fast"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1500Mi
EOF
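
Optionally, confirm that the claim is Bound and a PV was provisioned before continuing:

$ oc get pvc myclaim
$ oc get pv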

3. create a pod that writes to the volume:

$ oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: testpod
  labels: 
    name: test
spec: 
  restartPolicy: Never
  containers: 
    - resources:
        limits :
          cpu: 0.5
      image: gcr.io/google_containers/busybox
      command:
        - "/bin/sh"
        - "-c"
        - "while true; do date; date >>/mnt/test/date; sleep 1; done"
      name: busybox
      volumeMounts:
        - name: vol
          mountPath: /mnt/test
  volumes:
      - name: vol
        persistentVolumeClaim:
          claimName: myclaim
EOF
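
Optionally, check that the pod is Running and that the loop is writing to the volume:

$ oc get pod testpod
$ oc logs testpod | tail -n 3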


Now the magic:

4. find out on which node the pod runs, e.g.
$ oc describe pod testpod
... "Successfully assigned testpod to ip-172-18-8-57.ec2.internal"

5. on the node, look where the volume is mounted and check that "date" is there:
$ mount 
...
/dev/xvdba on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-1c6cc08d type ext4 ...
$ ls /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-1c6cc08d
date


6. on the node:
$ service atomic-openshift-node stop

7. on the master, delete the pod
$ oc delete pod testpod

8. on the node:
$ service atomic-openshift-node start

9. wait for a minute or so...

10. on the node, check that the volume is still mounted *and* the directory where it is mounted is empty:

$ mount 
...
/dev/xvdba on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-1c6cc08d type ext4 ...

$ ls /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-1c6cc08d


Actual results:
-> something removed the file "date" from the volume!

Expected results:
-> the data is still there

Comment 1 Jan Safranek 2016-11-24 18:46:07 UTC
We need to backport https://github.com/kubernetes/kubernetes/pull/27970 and https://github.com/kubernetes/kubernetes/pull/36840 into 3.3

Comment 4 hchen 2016-11-28 18:59:27 UTC
I am wondering if there is a chance that the data file still lives on the volume but the volume was detached after step 10. Can you get the mount output? If the EBS volume is detached, can you re-attach it and check whether the file is still there?
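
For example, something like this should work to re-attach and inspect the volume (the instance ID and mount point are placeholders, and the device name seen on the instance may differ):

$ aws ec2 attach-volume --volume-id vol-1c6cc08d --instance-id <node-instance-id> --device /dev/xvdba
$ mkdir -p /mnt/check && mount /dev/xvdba /mnt/check && ls /mnt/check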

Comment 5 Jianwei Hou 2016-11-29 07:37:46 UTC
I tried to reproduce this on GCE following the above steps. After step 9, the pod was deleted; running "mount" still showed the directories, but they were no longer accessible.

```
-bash-4.2# mount|grep pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013
/dev/sdc on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013 type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sdc on /var/lib/origin/openshift.local.volumes/pods/6c21aa22-b5f9-11e6-8bac-42010af00013/volumes/kubernetes.io~gce-pd/pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013 type ext4 (rw,relatime,seclabel,data=ordered)

-bash-4.2# ls /var/lib/origin/openshift.local.volumes/pods/6c21aa22-b5f9-11e6-8bac-42010af00013/volumes/kubernetes.io~gce-pd/pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013/
ls: reading directory /var/lib/origin/openshift.local.volumes/pods/6c21aa22-b5f9-11e6-8bac-42010af00013/volumes/kubernetes.io~gce-pd/pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013/: Input/output error
```

From the GCE console, I looked up the volume and found it was 'available' (i.e. detached). Since the PV and PVC were still there, I recreated the pod to use the volume. The data in the volume was **NOT** wiped.

So in the scenario from the description, after step 10 the system was left with a stale mount; the volume had already been detached from the node.
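
For reference, the attachment state can also be checked from the command line instead of the console (disk name taken from the mount path above, zone is a placeholder); an attached disk lists its instance under 'users':

$ gcloud compute disks describe kubernetes-dynamic-pvc-5f41ae8d-b5f9-11e6-8bac-42010af00013 --zone <zone>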

Comment 7 Jianwei Hou 2016-11-29 11:00:18 UTC
Reproduced it on AWS with 'openshift v3.3.1.5'.

Yes, the volume was still attached and mounted, and the data in it was lost.

Comment 11 Hemant Kumar 2016-11-29 20:47:35 UTC
Just wanted to reiterate what hchen posted. I started a cluster with OpenShift (v3.3.1.5) on AWS using a Flexy launch and created a PVC with the following YAML:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dyn-claim
  annotations:
    volume.alpha.kubernetes.io/storage-class: "bar" 
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi


and the pod looked like:
apiVersion: v1
kind: Pod
metadata:
  name: testpod
  labels:
    name: test
spec:
  restartPolicy: Never
  containers:
    - resources:
        limits :
          cpu: 0.5
      image: gcr.io/google_containers/busybox
      command:
        - "/bin/sh"
        - "-c"
        - "while true; do date; date >>/mnt/test/date; sleep 1; done"
      name: busybox
      volumeMounts:
        - name: vol
          mountPath: /mnt/test
  volumes:
      - name: vol
        persistentVolumeClaim:
          claimName: dyn-claim

I first created the PV/PVC and started the pod. After that I stopped the atomic-openshift-node service on the node where the pod was scheduled and then deleted the pod (pod deletion was stuck in the Terminating state).

After waiting a while, I started the atomic-openshift-node service again and the volume mount was gone from the node. I checked the volume by mounting it in another pod, but the data was still there.
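
A minimal pod like the following is enough for that check (illustrative only; the name "checkpod" is arbitrary), reusing the same claim and then listing the mount with oc exec:

apiVersion: v1
kind: Pod
metadata:
  name: checkpod
spec:
  containers:
    - name: busybox
      image: gcr.io/google_containers/busybox
      command: ["/bin/sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: vol
          mountPath: /mnt/test
  volumes:
    - name: vol
      persistentVolumeClaim:
        claimName: dyn-claim

$ oc exec checkpod -- ls /mnt/test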

Comment 13 Eric Paris 2016-11-29 23:14:39 UTC
The key to reproducing under 3.3.1.5 is to make sure the node comes up with
```
  enable-controller-attach-detach:
  - 'false'
```

That would be true for upgraded clusters, but would not be the default for new clusters...
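
For reference, on such a node this setting typically lives under kubeletArguments in the node configuration (assuming the default config location), followed by a node service restart:

```
# /etc/origin/node/node-config.yaml (excerpt)
kubeletArguments:
  enable-controller-attach-detach:
  - 'false'
```

$ service atomic-openshift-node restart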

Comment 14 Eric Paris 2016-11-29 23:18:43 UTC
I should also point out (not surprising given the attach controller behavior) that this does not reproduce if you use a hostDir PV.

Comment 15 Jan Safranek 2016-11-30 13:41:32 UTC
(In reply to Eric Paris from comment #13)
> The key to reproducing under 3.3.1.5 is to make sure the node comes up with
>
>   enable-controller-attach-detach:
>   - 'false'


I can reproduce it with enable-controller-attach-detach: true; the key is probably to wait long enough before starting the node again, i.e. between steps 7 and 8. The pod should disappear from the API server; "Terminating" is not enough.

"oc delete pod testpod --grace-period=0" speeds things up.

Comment 21 Bradley Childs 2016-12-01 01:47:40 UTC
Fix merged into OSE 3.3:  https://github.com/openshift/ose/pull/480

The 3.3 branch of Origin has test failures, so we haven't merged there yet (but there is a PR here: https://github.com/openshift/origin/pull/12024).

Additionally, we've validated that the bug doesn't exist in:

OSE 3.1.x
OSE 3.2.x
OSE 3.3.x (fixed)
OSE 3.4


QE, please let me know when you've created a test case for this so we can validate it.

Comment 28 Jianwei Hou 2016-12-05 06:41:34 UTC
Verified on 
openshift v3.3.1.7
kubernetes v1.3.0+52492b4
etcd 2.3.0+git

Confirmed this data deletion issue is fixed. Also tested that the issue does not exist in 3.1, 3.2 and 3.4.
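
The installed build can be cross-checked against the Fixed In Version directly on the node, e.g.:

$ rpm -q atomic-openshift-node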

Comment 35 errata-xmlrpc 2016-12-07 20:59:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:2915

Comment 36 Eric Paris 2017-03-02 16:47:59 UTC
*** Bug 1427536 has been marked as a duplicate of this bug. ***

