Bug 1508107

Summary: [pro][pro-us-east-1] Error attaching EBS volume: VolumeInUse
Product: OpenShift Container Platform
Reporter: Will Gordon <wgordon>
Component: Storage
Assignee: Hemant Kumar <hekumar>
Status: CLOSED ERRATA
QA Contact: Chao Yang <chaoyang>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: aos-bugs, aos-storage-staff, lxia, rcwwilliams07
Target Milestone: ---
Target Release: 3.9.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-06-06 06:56:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Will Gordon 2017-10-31 20:21:22 UTC
Description of problem:
Error attaching EBS volume: VolumeInUse

Version-Release number of selected component (if applicable):
openshift v3.6.173.0.21
kubernetes v1.6.1+5115d708d7

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:
$ oc get pvc -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      pv.kubernetes.io/bind-completed: "yes"
      pv.kubernetes.io/bound-by-controller: "yes"
      volume.beta.kubernetes.io/storage-class: ebs
      volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
    creationTimestamp: 2017-08-12T14:54:05Z
    name: mysql
    namespace: lunchroom
    resourceVersion: "71449379"
    selfLink: /api/v1/namespaces/lunchroom/persistentvolumeclaims/mysql
    uid: 16487201-7f6e-11e7-9104-125b034d2f46
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
    volumeName: pvc-16487201-7f6e-11e7-9104-125b034d2f46
  status:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 1Gi
    phase: Bound
kind: List
metadata: {}
resourceVersion: ""
selfLink: ""

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Will Gordon 2017-10-31 20:25:25 UTC
oc describe pv pvc-16487201-7f6e-11e7-9104-125b034d2f46                                                  
Name:		pvc-16487201-7f6e-11e7-9104-125b034d2f46
Labels:		failure-domain.beta.kubernetes.io/region=us-east-1
		failure-domain.beta.kubernetes.io/zone=us-east-1c
Annotations:	kubernetes.io/createdby=aws-ebs-dynamic-provisioner
		pv.kubernetes.io/bound-by-controller=yes
		pv.kubernetes.io/provisioned-by=kubernetes.io/aws-ebs
		volume.beta.kubernetes.io/storage-class=ebs
StorageClass:	ebs
Status:		Bound
Claim:		lunchroom/mysql
Reclaim Policy:	Delete
Access Modes:	RWO
Capacity:	1Gi
Message:
Source:
    Type:	AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:	aws://us-east-1c/vol-0f296ccfc91c13906
    FSType:	ext4
    Partition:	0
    ReadOnly:	false
Events:		<none>
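
The VolumeID shown above packs the availability zone and the EBS volume ID into one string. As a minimal sketch, assuming the `aws://<zone>/<volume-id>` layout seen in this output always holds, the two parts could be separated like this (the names `EbsVolumeRef` and `parse_aws_volume_id` are hypothetical and not part of any OpenShift or AWS tooling):

```python
from typing import NamedTuple


class EbsVolumeRef(NamedTuple):
    """Hypothetical container for the two parts of a PV VolumeID string."""
    zone: str
    volume_id: str


def parse_aws_volume_id(spec: str) -> EbsVolumeRef:
    """Split a string such as 'aws://us-east-1c/vol-...' into zone and volume ID."""
    prefix = "aws://"
    if not spec.startswith(prefix):
        raise ValueError(f"unexpected VolumeID format: {spec!r}")
    # Everything after the prefix is assumed to be '<zone>/<volume-id>'.
    zone, sep, volume_id = spec[len(prefix):].partition("/")
    if not sep or not zone or not volume_id.startswith("vol-"):
        raise ValueError(f"unexpected VolumeID format: {spec!r}")
    return EbsVolumeRef(zone=zone, volume_id=volume_id)


ref = parse_aws_volume_id("aws://us-east-1c/vol-0f296ccfc91c13906")
print(ref.zone, ref.volume_id)
```

The bare volume ID is the form needed when checking in AWS which instance the volume is currently attached to.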

Comment 2 Hemant Kumar 2018-01-16 21:51:00 UTC
We have implemented a generic recovery mechanism in OpenShift 3.9 that detects volumes stuck on another instance and, if no pod is actively using the volume on that instance, detaches them.

One easy way to reproduce this problem (before 3.9):

1. Create a standalone pod (no deployment, rc, etc.) with volumes.
2. Shut down the node.
3. Wait for the pod on the node to be deleted.
4. Once the pod is deleted (poll with kubectl get pods) but before the controller-manager can detach the volume (there is a minimum delay of 6 minutes), restart the controller-manager.
5. The restart wipes the volume information from the controller-manager.
6. Now try to use the same PVC in another pod (which may be scheduled on a different node). The pod will be stuck in the "ContainerCreating" state on 3.7, but not on 3.9.

There are a few other ways to reproduce this error, but this is perhaps the easiest.
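
The recovery check described in this comment can be sketched as a small decision function. This is only an illustration under stated assumptions, not the actual OpenShift 3.9 code: `VolumeState`, `should_force_detach`, and the instance and pod names are all hypothetical, and the real controller works from cloud-provider and API-server state rather than plain dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class VolumeState:
    """Hypothetical view of one EBS volume as reported by the cloud provider."""
    volume_id: str
    attached_instance: Optional[str]  # None when the volume is not attached anywhere


def should_force_detach(
    volume: VolumeState,
    target_instance: str,
    active_pods_by_instance: Dict[str, Set[str]],
) -> bool:
    """Return True if the volume looks stuck on another instance and is unused there.

    active_pods_by_instance maps an instance ID to the pods on that instance
    that are actively using this volume (an assumed input, for illustration).
    """
    holder = volume.attached_instance
    if holder is None:
        return False  # not attached anywhere, so a normal attach can proceed
    if holder == target_instance:
        return False  # already attached where the new pod needs it
    if active_pods_by_instance.get(holder):
        return False  # a pod still uses the volume there; detaching is unsafe
    return True  # attached elsewhere with no active user: treat as stuck


stuck = VolumeState("vol-0f296ccfc91c13906", attached_instance="i-old")
print(should_force_detach(stuck, "i-new", {}))
print(should_force_detach(stuck, "i-new", {"i-old": {"mysql-1"}}))
```

The first call models the reproduction steps above: the volume is still attached to the shut-down node's instance, no pod uses it there, and a new pod needs it elsewhere, so the sketch reports that a detach is warranted. The second call shows the guard against detaching a volume that a pod is still using.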

Comment 4 Liang Xia 2019-04-17 08:47:03 UTC
Unable to reproduce with the version below; moving the bug to VERIFIED.

oc v3.9.77
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-10-24.ec2.internal:8443
openshift v3.9.77
kubernetes v1.9.1+a0ce1bc657

Comment 6 errata-xmlrpc 2019-06-06 06:56:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0788