Bug 1453078 - Master service restart can cause volume/pv/pvc un-sync
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.6.0
Hardware: Unspecified OS: Unspecified
Priority: medium Severity: medium
Assigned To: Matthew Wong
QA Contact: Liang Xia
Reported: 2017-05-22 01:58 EDT by Liang Xia
Modified: 2017-08-16 15 EDT
Last Closed: 2017-08-10 01:25:32 EDT
Type: Bug

Attachments
master logs (2.28 MB, application/x-gzip)
2017-05-22 02:23 EDT, Liang Xia


External Trackers
Tracker: Red Hat Product Errata
ID: RHEA-2017:1716
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Container Platform 3.6 RPM Release Advisory
Last Updated: 2017-08-10 05:02:50 EDT

Description Liang Xia 2017-05-22 01:58:54 EDT
Description of problem:
After the master service is restarted, provisioning fails for some volumes with the error "the resource XXX already exists".

Version-Release number of selected component (if applicable):
openshift v3.6.74
kubernetes v1.6.1+5115d708d7
etcd 3.1.0

How reproducible:
Very often, but not always

Steps to Reproduce:
1. Log in to the server and create a project.
$ oc login $server
$ oc new-project project-name

2. Create PVCs in a loop.
$ for i in {1..100} ; do
  cp pvc.json pvc-$i.json
  sed -i "s/#NAME#/pvc-$RANDOM/" pvc-$i.json
  oc create -f pvc-$i.json
  rm -f pvc-$i.json
done

3. Restart the master service while the PVCs are still being created.
$ systemctl restart atomic-openshift-master

4. Wait for several minutes.

5. Check the PVC status.
$ oc describe pvc

Actual results:
$ oc describe pvc
Name:        pvc-959
Namespace:    lxiap
StorageClass:    
Status:        Pending
Volume:        
Labels:        name=dynamic-pvc
Annotations:    volume.alpha.kubernetes.io/storage-class=foo
        volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/gce-pd
Capacity:    
Access Modes:    
Events:
  FirstSeen    LastSeen    Count    From                SubObjectPath    Type        Reason            Message
  ---------    --------    -----    ----                -------------    --------    ------            -------
  8m        8m        1    persistent-volume-controller            Warning        ProvisioningFailed    Failed to provision volume with StorageClass "": googleapi: Error 409: The resource 'projects/openshift-gce-devel/zones/us-central1-a/disks/kubernetes-dynamic-pvc-dd23a7dd-3e9d-11e7-8885-42010af00002' already exists, alreadyExists
  8m        8m        2    persistent-volume-controller            Warning        ProvisioningFailed    Failed to provision volume with StorageClass "": googleapi: Error 409: The resource 'projects/openshift-gce-devel/zones/us-central1-a/disks/kubernetes-dynamic-pvc-dd23a7dd-3e9d-11e7-8885-42010af00002' already exists, alreadyExists
  7m        6m        5    persistent-volume-controller            Warning        ProvisioningFailed    Failed to provision volume with StorageClass "": googleapi: Error 409: The resource 'projects/openshift-gce-devel/zones/us-central1-a/disks/kubernetes-dynamic-pvc-dd23a7dd-3e9d-11e7-8885-42010af00002' already exists, alreadyExists
  6m        6m        2    persistent-volume-controller            Warning        ProvisioningFailed    Failed to provision volume with StorageClass "": googleapi: Error 409: The resource 'projects/openshift-gce-devel/zones/us-central1-a/disks/kubernetes-dynamic-pvc-dd23a7dd-3e9d-11e7-8885-42010af00002' already exists, alreadyExists
  6m        6m        1    persistent-volume-controller            Warning        ProvisioningFailed    Failed to provision volume with StorageClass "": googleapi: Error 409: The resource 'projects/openshift-gce-devel/zones/us-central1-a/disks/kubernetes-dynamic-pvc-dd23a7dd-3e9d-11e7-8885-42010af00002' already exists, alreadyExists
  5m        5m        1    persistent-volume-controller            Warning        ProvisioningFailed    Failed to provision volume with StorageClass "": googleapi: Error 409: The resource 'projects/openshift-gce-devel/zones/us-central1-a/disks/kubernetes-dynamic-pvc-dd23a7dd-3e9d-11e7-8885-42010af00002' already exists, alreadyExists
  5m        11s        23    persistent-volume-controller            Warning        ProvisioningFailed    Failed to provision volume with StorageClass "": googleapi: Error 409: The resource 'projects/openshift-gce-devel/zones/us-central1-a/disks/kubernetes-dynamic-pvc-dd23a7dd-3e9d-11e7-8885-42010af00002' already exists, alreadyExists


Expected results:
The PVC should reuse the existing volume instead of failing; see https://github.com/kubernetes/kubernetes/pull/38702

Additional info:
$ cat pvc.json 
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "#NAME#",
    "annotations": {
        "volume.alpha.kubernetes.io/storage-class": "foo"
    },
    "labels": {
        "name": "dynamic-pvc"
    }
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "1Gi"
      }
    }
  }
}
Comment 1 Liang Xia 2017-05-22 02:23 EDT
Created attachment 1280907
master logs

journalctl -u atomic-openshift-master > atomic-openshift-master.log
Comment 3 Matthew Wong 2017-05-23 18:08:18 EDT
The bug is that we check for the alreadyExists error too late. We should check for it right after gce.service.Disks.Insert(gce.projectID, zone, diskToCreate).Do() returns, not after gce.waitForZoneOp(createOp, zone).
Comment 4 Matthew Wong 2017-05-24 14:46:18 EDT
https://github.com/openshift/origin/pull/14329
Comment 6 Liang Xia 2017-06-05 04:30:36 EDT
Verified that the fix works on OCP v3.6.86.

Used the same steps as in comment 0; all PVCs become Bound.
After the PVCs are deleted, the PVs and volumes are deleted automatically.
Comment 8 errata-xmlrpc 2017-08-10 01:25:32 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
