Copied from https://bugzilla.redhat.com/show_bug.cgi?id=1599217#c6

Looking into the logs, I can see OpenShift indeed initiated attach, but it timed out waiting for 10.70.46.1 and 10.70.46.75:

Jul 10 15:56:32 dhcp46-175.lab.eng.blr.redhat.com atomic-openshift-node[2453]: I0710 15:56:32.454227 2453 iscsi_util.go:314] iscsi: dev /dev/disk/by-path/ip-10.70.46.1:3260-iscsi-iqn.2016-12.org.gluster-block:d2a42cc7-6a07-47e0-9b96-c25706d2fad2-lun-0 err Could not attach disk: Timeout after 10s

Jul 10 15:56:42 dhcp46-175.lab.eng.blr.redhat.com atomic-openshift-node[2453]: I0710 15:56:42.217983 2453 iscsi_util.go:314] iscsi: dev /dev/disk/by-path/ip-10.70.46.75:3260-iscsi-iqn.2016-12.org.gluster-block:d2a42cc7-6a07-47e0-9b96-c25706d2fad2-lun-0 err Could not attach disk: Timeout after 10s

Only 10.70.46.175 succeeds:

Jul 10 15:56:43 dhcp46-175.lab.eng.blr.redhat.com atomic-openshift-node[2453]: I0710 15:56:43.375884 2453 iscsi_util.go:318] iscsi: dev /dev/disk/by-path/ip-10.70.46.175:3260-iscsi-iqn.2016-12.org.gluster-block:d2a42cc7-6a07-47e0-9b96-c25706d2fad2-lun-0 added to devicepath

Since only the *last* one succeeded, OpenShift quickly checked that there is no /sys/block/dm-* that has /sys/block/dm-X/slaves/sds (i.e. considers the path as not part of multipath) and mounts it.

There are several issues with this approach:

1. iscsi target or initiator is slow to attach the volume (that's intended, it's a stress test, right?)
2. OpenShift does not wait a while for multipath to evaluate a device.
3. OpenShift has no configurable parameter for attach timeout, 10s is hardcoded.
This is likely dup of #1597320
Upstream issue: https://github.com/kubernetes/kubernetes/issues/60894
One hotfix I can do relatively quickly: OpenShift can check several times (for 10s?) whether a device is part of multipath before mounting it as a single path. This is not a proper solution to the problem, but it will remove the "blocking" part of this bug. It will slow down iSCSI setup for customers that run multiple portals for the same volume but don't use multipath. Is that even a valid setup? [I know CNS does not use this setup, but some other iSCSI user might.]
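
To make the proposed hotfix concrete, here is a rough sketch of the polling idea in Go. It is not the code in iscsi_util.go: the function names (isMultipathSlave, waitForMultipath), the sysfs root parameter and the timeout/interval values are all assumptions made for illustration. It only repeats the /sys/block/dm-*/slaves/<dev> check described above until the device shows up or the timeout runs out.

```go
package main

import (
	"fmt"
	"path/filepath"
	"time"
)

// isMultipathSlave reports whether dev (e.g. "sds") appears under
// <sysBlock>/dm-*/slaves/, i.e. whether some device-mapper device
// has already claimed it. Hypothetical helper, not the upstream code.
func isMultipathSlave(sysBlock, dev string) (bool, error) {
	pattern := filepath.Join(sysBlock, "dm-*", "slaves", dev)
	matches, err := filepath.Glob(pattern)
	if err != nil {
		return false, err
	}
	return len(matches) > 0, nil
}

// waitForMultipath repeats the check every interval until the device is
// claimed or timeout has elapsed. It returns false (and no error) when
// the device never became part of a dm device within the timeout.
func waitForMultipath(sysBlock, dev string, timeout, interval time.Duration) (bool, error) {
	deadline := time.Now().Add(timeout)
	for {
		claimed, err := isMultipathSlave(sysBlock, dev)
		if err != nil || claimed {
			return claimed, err
		}
		if time.Now().After(deadline) {
			return false, nil
		}
		time.Sleep(interval)
	}
}

func main() {
	// Assumed values: a short timeout for the demo; the comment above
	// floats something around 10s for the real thing.
	claimed, err := waitForMultipath("/sys/block", "sds", 2*time.Second, 200*time.Millisecond)
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	if claimed {
		fmt.Println("sds is part of a multipath device; use the dm device")
	} else {
		fmt.Println("sds was not claimed by multipath; mounting single path")
	}
}
```

The cost mentioned above is visible here: a setup with multiple portals but no multipath always pays the full timeout before the single path gets mounted.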
> This is likely dup of #1597320

Sorry, it's a different issue.
Who's looking at retry logic in the CSI driver, if needed?