Bug 1710168

Summary: [e2e] Attach followed by a detach of same volume is stuck waiting for volume attachment on old device path
Product: OpenShift Container Platform
Reporter: Hemant Kumar <hekumar>
Component: Storage
Assignee: Hemant Kumar <hekumar>
Status: CLOSED ERRATA
QA Contact: Liang Xia <lxia>
Severity: unspecified
Priority: unspecified
Version: 4.1.0
CC: aos-bugs, aos-storage-staff
Target Milestone: ---
Target Release: 4.1.z
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-08-15 14:24:02 UTC
Type: Bug

Description Hemant Kumar 2019-05-15 03:16:19 UTC
e2e failure : https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/22819/pull-ci-openshift-origin-master-e2e-aws/8711/

The timeline of events from my debugging so far:

# detach from the old node (and old pod) is started
I0514 19:30:17.938746       1 reconciler.go:230] attacherDetacher.DetachVolume started for volume "pvc-a007686c-767e-11e9-9339-129c3af80d5c" (UniqueName: "kubernetes.io/aws-ebs/aws://us-east-1b/vol-04f136c2462a930a1") on node "ip-10-0-144-99.ec2.internal" 

# Kubelet sees a new pod that uses the same volume
May 14 19:30:18 ip-10-0-144-99 hyperkube[998]: I0514 19:30:18.215594     998 status_manager.go:527] Status for pod "volume-client_e2e-tests-volumes-ffhmv(b49b3cca-767e-11e9-9339-129c3af80d5c)" updated successfully: (1, {Phase:Pending Conditions:[{Type:Initialized Status:True LastProb

# detach from the old node (and old pod) succeeds
I0514 19:30:25.001324       1 aws.go:2244] waitForAttachmentStatus returned non-nil attachment with state=detached: {
  AttachTime: 2019-05-14 19:29:50 +0000 UTC,
  DeleteOnTermination: false,
  Device: "/dev/xvdbl",
  InstanceId: "i-018e05c99167cc3f9",
  State: "detaching",
  VolumeId: "vol-04f136c2462a930a1"
}
I0514 19:30:25.326374       1 operation_generator.go:423] DetachVolume.Detach succeeded for volume "pvc-a007686c-767e-11e9-9339-129c3af80d5c" (UniqueName: "kubernetes.io/aws-ebs/aws://us-east-1b/vol-04f136c2462a930a1") on node "ip-10-0-144-99.ec2.internal"

# controller attempts to start a new attach for the new pod (but on the same node)
I0514 19:30:25.364349       1 reconciler.go:289] attacherDetacher.AttachVolume started for volume "pvc-a007686c-767e-11e9-9339-129c3af80d5c" (UniqueName: "kubernetes.io/aws-ebs/aws://us-east-1b/vol-04f136c2462a930a1") from node "ip-10-0-144-99.ec2.internal" 


# kubelet's VerifyControllerAttachedVolume succeeds with the old device path
May 14 19:30:34 ip-10-0-144-99 hyperkube[998]: I0514 19:30:34.388088     998 operation_generator.go:1199] Controller attach succeeded for volume "pvc-a007686c-767e-11e9-9339-129c3af80d5c" (UniqueName: "kubernetes.io/aws-ebs/aws://us-east-1b/vol-04f136c2462a930a1") pod "volume-client" (UID: "b49b3cca-767e-11e9-9339-129c3af80d5c") device path: "/dev/xvdbl"


# in the controller, the actual attach for the second pod succeeds
I0514 19:30:59.804475       1 operation_generator.go:339] AttachVolume.Attach succeeded for volume "pvc-a007686c-767e-11e9-9339-129c3af80d5c" (UniqueName: "kubernetes.io/aws-ebs/aws://us-east-1b/vol-04f136c2462a930a1") from node "ip-10-0-144-99.ec2.internal"


The second pod is now stuck waiting for the device to be attached at /dev/xvdbl, whereas the new attach call succeeded on a different device path.
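
A rough sketch (not the actual kubelet code) of how the kubelet ends up on the old path: VerifyControllerAttachedVolume trusts whatever devicePath the attach/detach controller last published in node.Status.VolumesAttached. findDevicePath below is a hypothetical helper written for illustration only.

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// findDevicePath mimics the lookup: return whatever devicePath the controller
// last published for this volume in node.Status.VolumesAttached.
func findDevicePath(node *v1.Node, volumeName v1.UniqueVolumeName) (string, bool) {
	for _, attached := range node.Status.VolumesAttached {
		if attached.Name == volumeName {
			return attached.DevicePath, true
		}
	}
	return "", false
}

func main() {
	const vol = "kubernetes.io/aws-ebs/aws://us-east-1b/vol-04f136c2462a930a1"
	// Node status still carries the entry from the previous attach.
	node := &v1.Node{Status: v1.NodeStatus{VolumesAttached: []v1.AttachedVolume{
		{Name: vol, DevicePath: "/dev/xvdbl"},
	}}}
	path, _ := findDevicePath(node, vol)
	// Prints /dev/xvdbl even though the new attach landed on a different device.
	fmt.Println("kubelet will wait for the device at", path)
}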

There are a couple of other "problematic" things going on:

1. The usual eventual consistency of "DescribeVolume" API calls strikes again:

See how the detach succeeds at 19:30:25:

I0514 19:30:25.326374       1 operation_generator.go:423] DetachVolume.Detach succeeded for volume "pvc-a007686c-767e-11e9-9339-129c3af80d5c" (UniqueName: "kubernetes.io/aws-ebs/aws://us-east-1b/vol-04f136c2462a930a1") on node "ip-10-0-144-99.ec2.internal" 

and yet:

I0514 19:30:25.510323       1 aws.go:1726] volume ID: vol-04f136c2462a930a1: device mappings from EC2 Instance: map[a:vol-00a246b72606dab0b ca:vol-049251631ad7e7638]
I0514 19:30:25.510335       1 aws.go:1738] volume ID: vol-04f136c2462a930a1: Full device mappings: map[a:vol-00a246b72606dab0b ca:vol-049251631ad7e7638]
I0514 19:30:25.510356       1 aws.go:1779] Assigned mount device bv -> volume vol-04f136c2462a930a1
E0514 19:30:25.732642       1 aws.go:2133] "Error attaching EBS volume \"vol-04f136c2462a930a1\"" to instance "i-018e05c99167cc3f9" since volume is currently attached to "i-018e05c99167cc3f9"

So the second DescribeVolume API call is actually lying (returning stale data).
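
One way to avoid acting on a single stale answer is to keep polling until the view converges. Below is a minimal sketch of that idea only; describeAttachment is a hypothetical closure standing in for the real DescribeVolumes-based lookup, and this is not the cloud-provider code.

package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// waitUntilDetached keeps polling the (possibly stale) view of the volume until
// it no longer reports an attachment, instead of trusting the first answer.
func waitUntilDetached(describeAttachment func() (string, error)) error {
	return wait.PollImmediate(200*time.Millisecond, 10*time.Second, func() (bool, error) {
		instance, err := describeAttachment()
		if err != nil {
			return false, err
		}
		// A stale read here is exactly what produced the spurious
		// "currently attached to i-018e05c99167cc3f9" error above.
		return instance == "", nil
	})
}

func main() {
	calls := 0
	// Fake lookup: the first read is stale and still reports the old instance,
	// the second read reflects the completed detach.
	stale := func() (string, error) {
		calls++
		if calls == 1 {
			return "i-018e05c99167cc3f9", nil
		}
		return "", nil
	}
	fmt.Println(waitUntilDetached(stale)) // prints <nil> once the view converges
}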


2. Traditionally we have disabled the reconciler resync mechanism in OpenShift - https://github.com/openshift/openshift-ansible/pull/5387. This throws one more variable into the mix:

I0514 19:30:56.098777       1 operation_generator.go:268] VerifyVolumesAreAttached.BulkVerifyVolumes failed for node "ip-10-0-144-99.ec2.internal" and volume "pvc-a007686c-767e-11e9-9339-129c3af80d5c"


At 19:30:56, that verification mechanism in the controller failed to verify the volume's attachment to the node, and hence it will remove the volume from the actual state of the world (ASOW). I will open a PR against the controller-manager to disable this again. The volume verification mechanism is known to race with other operations in the reconciler.
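
To make the race concrete, here is a toy illustration with invented types (actualState, markAttached, prune), not the controller code: a periodic bulk verify that prunes the actual state of the world based on a stale cloud answer can undo an attach the reconciler has just recorded.

package main

import (
	"fmt"
	"sync"
	"time"
)

// actualState is a stand-in for the controller's actual state of the world (ASOW).
type actualState struct {
	mu       sync.Mutex
	attached map[string]bool
}

func (a *actualState) markAttached(vol string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.attached[vol] = true
}

func (a *actualState) prune(vol string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.attached, vol)
}

func main() {
	state := &actualState{attached: map[string]bool{}}
	vol := "vol-04f136c2462a930a1"

	var wg sync.WaitGroup
	wg.Add(2)

	// Reconciler: the attach is in flight and is recorded once the cloud call returns.
	go func() {
		defer wg.Done()
		time.Sleep(50 * time.Millisecond) // simulated cloud attach latency
		state.markAttached(vol)
	}()

	// Bulk verify: samples the cloud view around the same time, gets a stale
	// "not attached" answer, and prunes the entry the reconciler relies on.
	go func() {
		defer wg.Done()
		time.Sleep(70 * time.Millisecond)
		cloudSaysAttached := false // stale answer from the verification call
		if !cloudSaysAttached {
			state.prune(vol)
		}
	}()

	wg.Wait()
	fmt.Println("ASOW thinks the volume is attached:", state.attached[vol]) // false, despite the successful attach
}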

Comment 1 Hemant Kumar 2019-05-15 13:54:49 UTC
This is not a blocker. We will fix this in 4.1.z

Comment 2 Hemant Kumar 2019-05-15 14:11:16 UTC
Okay, I think I know why the kubelet saw the old device path. The problem is: when the attach for the second pod failed with "Error attaching EBS volume \"vol-04f136c2462a930a1\" to instance \"i-018e05c99167cc3f9\" since volume is currently attached to \"i-018e05c99167cc3f9\"", this triggered a dangling volume error from AWS. The dangling volume error is supposed to auto-correct any volume that is attached to a node it should not be attached to. But in this case the dangling volume error was triggered not because the volume was actually attached to the node, but because the `describeVolume` call for the 2nd attachment returned stale data (AWS eventual consistency).

So the dangling volume exception caused the volume to be added to the actual state of the world (so that it can be detached later), but this also caused the volume to be reported in the node's Status field. That made the second pod assume the volume was attached at an invalid device path, and hence the 2nd pod was stuck for 10 minutes waiting for the volume to attach on a device path (/dev/xvdbl) that does not exist.
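
A compressed, hypothetical sketch of that chain (the real code lives in the attach/detach controller and the AWS provider; danglingError, actualStateOfWorld and the field names below are invented for illustration): handling the dangling error records the volume as attached with whatever device path accompanied the error, and that same record is what ends up in node.Status.VolumesAttached.

package main

import "fmt"

type danglingError struct {
	volumeID   string
	node       string
	devicePath string // stale: /dev/xvdbl from the previous attachment
}

func (e *danglingError) Error() string {
	return fmt.Sprintf("volume %s is dangling on node %s at %s", e.volumeID, e.node, e.devicePath)
}

type attachment struct{ volumeID, node, devicePath string }

func main() {
	actualStateOfWorld := map[string]attachment{}

	// Attach for the second pod fails; the provider turns the stale
	// "currently attached" answer into a dangling error.
	var err error = &danglingError{
		volumeID:   "vol-04f136c2462a930a1",
		node:       "ip-10-0-144-99.ec2.internal",
		devicePath: "/dev/xvdbl",
	}

	// Controller side: record the "dangling" attachment so it can be detached later.
	if d, ok := err.(*danglingError); ok {
		actualStateOfWorld[d.volumeID] = attachment{d.volumeID, d.node, d.devicePath}
	}

	// The same record is then published to node.Status.VolumesAttached, which is
	// where the kubelet picked up /dev/xvdbl and waited on it for 10 minutes.
	fmt.Println("published to node status:", actualStateOfWorld["vol-04f136c2462a930a1"].devicePath)
}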

Comment 3 Hemant Kumar 2019-05-15 18:56:47 UTC
Logged https://github.com/kubernetes/kubernetes/issues/77949 in upstream to track it and discuss a fix.

Comment 4 Hemant Kumar 2019-05-31 19:57:37 UTC
Opened a PR for fixing this - https://github.com/kubernetes/kubernetes/pull/78595

Comment 5 Hemant Kumar 2019-08-07 17:28:12 UTC
Fix for OpenShift, already merged - https://github.com/openshift/origin/pull/23013

Comment 7 Liang Xia 2019-08-08 03:30:39 UTC
@Hemant Could you direct QE on how to reproduce and/or verify this bug? Thanks.

Comment 8 Liang Xia 2019-08-12 02:57:01 UTC
Checked the builds in https://openshift-gce-devel.appspot.com/builds/origin-ci-test/pr-logs/pull/22819/pull-ci-openshift-origin-master-e2e-aws/

All the builds after 8711 (the failed one reported in comment 0), namely
8716, 8720, 8722, and 8725, passed.

Move bug to verified.

Comment 10 errata-xmlrpc 2019-08-15 14:24:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2417