With CSI migration enabled, the vSphere CSI driver occasionally reports errors like this:

{"level":"error","time":"2021-12-02T16:45:30.085848565Z","caller":"vanilla/controller.go:1026","msg":"failed to get VolumeID from volumeMigrationService for volumePath: \"[WorkloadDatastore] 5137595f-7ce3-e95a-5c03-06d835dea807/ci-op-35gg80lm-36309-q-pvc-52ea5e0c-2029-4e30-a3bf-c722e07ac087.vmdk\"","TraceId":"ef466900-c0a5-4298-be62-572993efd6a2","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume.func1\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/pkg/csi/service/vanilla/controller.go:1026\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/pkg/csi/service/vanilla/controller.go:1050\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler.func1\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:5632\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).controllerPublishVolume\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/serialvolume/serial_volume_locker.go:120\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/serialvolume/serial_volume_locker.go:86\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:5634\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/google.golang.org/grpc/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/google.golang.org/grpc/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/google.golang.org/grpc/server.go:722"}

Example:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_vmware-vsphere-csi-driver-operator/49/pull-ci-openshift-vmware-vsphere-csi-driver-operator-master-e2e-vsphere-csi-migration/1466408251087130624/artifacts/e2e-vsphere-csi-migration/gather-extra/artifacts/pods/openshift-cluster-csi-drivers_vmware-vsphere-csi-driver-controller-5d8795b589-lxs2x_csi-driver.log
Decoded stack trace:

sigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume.func1
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/pkg/csi/service/vanilla/controller.go:1026
sigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/pkg/csi/service/vanilla/controller.go:1050
github.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler.func1
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:5632
github.com/rexray/gocsi/middleware/serialvolume.(*interceptor).controllerPublishVolume
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/serialvolume/serial_volume_locker.go:120
github.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/serialvolume/serial_volume_locker.go:86
github.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99
github.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/specvalidator/spec_validator.go:178
github.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/specvalidator/spec_validator.go:218
github.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware/specvalidator/spec_validator.go:177
github.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99
github.com/rexray/gocsi.(*StoragePlugin).injectContext
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/middleware.go:231
github.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:99
github.com/rexray/gocsi/utils.ChainUnaryServer.func2
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/rexray/gocsi/utils/utils_middleware.go:106
github.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:5634
google.golang.org/grpc.(*Server).processUnaryRPC
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/google.golang.org/grpc/server.go:1024
google.golang.org/grpc.(*Server).handleStream
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/google.golang.org/grpc/server.go:1313
google.golang.org/grpc.(*Server).serveStreams.func1.1
    /go/src/github.com/kubernetes-sigs/vsphere-csi-driver/vendor/google.golang.org/grpc/server.go:722
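For anyone who wants to reproduce the readable stack trace from a raw driver log entry, here is a minimal Python sketch. It assumes each log entry is a single JSON object with a "stacktrace" string field, as in the entry quoted above. The function name decode_stacktrace and the shortened sample entry are hypothetical, made up for illustration only; they are not part of the driver or of any tooling mentioned in this bug.

```python
import json
from typing import List


def decode_stacktrace(log_line: str) -> List[str]:
    """Return the stack trace of one JSON log entry as readable lines."""
    # json.loads turns the escaped \n and \t sequences into real characters.
    entry = json.loads(log_line)
    trace = entry.get("stacktrace", "")
    # One list item per non-empty line, with the leading tab removed.
    return [line.strip() for line in trace.splitlines() if line.strip()]


# Hypothetical, shortened entry in the same shape as the driver log above.
SAMPLE_LOG_LINE = (
    r'{"level":"error","msg":"example",'
    r'"stacktrace":"pkg.(*controller).Publish\n\t/go/src/pkg/controller.go:1026'
    r'\npkg.Handler\n\t/go/src/pkg/handler.go:5632"}'
)

if __name__ == "__main__":
    for frame_line in decode_stacktrace(SAMPLE_LOG_LINE):
        print(frame_line)
```

The function returns function names and file locations as alternating list items, so the result can be printed one per line or searched for a specific frame.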
Tested with today's CI version, 4.12.0-0.ci-2022-10-12-173207, with CSI migration enabled. I don't see any of these errors:

> failed to get VolumeID from volumeMigrationService for volumePath
> CnsFault error: CNS: Failed to attach disk because missing SCSI controller

I did see this error while TechPreviewNoUpgrade was being applied (i.e. nodes were drained and restarted):

Warning FailedAttachVolume 8m44s (x7 over 9m19s) attachdetach-controller AttachVolume.Attach failed for volume "csi.vsphere.vmware.com-[WorkloadDatastore] b4bc4663-de68-0b0e-6d89-0603566a4811/1.vmdk" : rpc error: code = Internal desc = failed to set keepAfterDeleteVm control flag for VolumeID "5b87ca09-3f72-4475-883e-995624b8710e"
Normal SuccessfulAttachVolume 8m11s attachdetach-controller AttachVolume.Attach succeeded for volume "csi.vsphere.vmware.com-[WorkloadDatastore] b4bc4663-de68-0b0e-6d89-0603566a4811/1.vmdk"

But it healed in ~30 seconds, so I don't think it's worth debugging.

I tested both in-line in-tree volumes and in-tree PVs, both running 40 Pods with 40 volumes each. I think the original errors were fixed by a recent CSI driver rebase.
Change the Target Release to 4.12.0