Bug 2062907

Summary: in-tree vsphere storage tests failing
Product: OpenShift Container Platform
Reporter: rvanderp
Component: Storage
Assignee: aos-storage-staff <aos-storage-staff>
Storage sub component: Operators
QA Contact: Wei Duan <wduan>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: high
Priority: unspecified
CC: aos-bugs, jsafrane
Version: 4.11
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-03-11 11:58:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description rvanderp 2022-03-10 21:26:11 UTC
Description of problem:
4.11 e2e techpreview jobs are failing while waiting for a volume test pod to start. The pod appears to be failing to schedule due to an issue with the volumeMigrationService.

~~~
2022-03-10T19:57:07.142687500Z E0310 19:57:07.142557       1 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/csi/csi.vsphere.vmware.com^[WorkloadDatastore] 706cfa61-7a2b-d2dd-d60e-06b6d880c5b7/e2e-vmdk-1646942212661214787.vmdk podName: nodeName:}" failed. No retries permitted until 2022-03-10 19:57:11.142534123 +0000 UTC m=+1713.493843996 (durationBeforeRetry 4s). Error: AttachVolume.Attach failed for volume "vsphere-tczvz" (UniqueName: "kubernetes.io/csi/csi.vsphere.vmware.com^[WorkloadDatastore] 706cfa61-7a2b-d2dd-d60e-06b6d880c5b7/e2e-vmdk-1646942212661214787.vmdk") from node "ci-op-w6gg2jjm-60285-kmg7x-worker-dpp7l" : rpc error: code = Internal desc = failed to get VolumeID from volumeMigrationService for volumePath: "[WorkloadDatastore] 706cfa61-7a2b-d2dd-d60e-06b6d880c5b7/e2e-vmdk-1646942212661214787.vmdk"
~~~


Version-Release number of selected component (if applicable):
- 4.11 tech preview e2e parallel tests

How reproducible:
consistently since the 8th of March

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Master Log:

Node Log (of failed PODs):
see example job for associated logs

PV Dump:
N/A
PVC Dump:
N/A

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
example job: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-vsphere-techpreview/1501996425003667456

appears to have started failing on March 8: https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-vsphere-techpreview

Comment 1 rvanderp 2022-03-10 21:32:15 UTC
This appears to be occurring now as the in-tree tests are re-enabled: https://github.com/openshift/origin/pull/26797

Comment 2 Jan Safranek 2022-03-11 11:58:29 UTC

*** This bug has been marked as a duplicate of bug 2029835 ***