Description of problem: Regarding code in /usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py # Check host_device only when transport is used, since otherwise it is # directly derived from properties. Only needed for unit tests # Loop until at least 1 path becomes available while ((self._get_transport() != "default" and not host_devices) or all(map(lambda x: not os.path.exists(x), host_devices))): Since in Nova kilo host_device initialized to None, and stays None in case /dev/disk/by-path wasn’t updated on the first try (https://github.com/openstack/nova/blob/stable/kilo/nova/virt/libvirt/volume.py#L842-L846), but in packstack code host_device modified to a list, then it won’t pass the os.path.exists() for the None initialized host_device param like in (https://github.com/openstack/nova/blob/stable/kilo/nova/virt/libvirt/volume.py#L446-L447) since os.path.exists crash for None values. One solution is to remove all the None values from the host_devices Version-Release number of selected component (if applicable): openstack-kilo Actual results: volume attach fails Expected results: volume attach passes and set to be 'in-use' Additional info:
Created attachment 1080290 [details] patch
Please provide logs and proper NVR versioning of the nova RPMs in use, i.e.: $ sudo rpm -qa | grep nova
Created attachment 1081770 [details] logs and confs logs and confs
# sudo rpm -qa | grep nova python-novaclient-2.23.0-1.el7ost.noarch openstack-nova-console-2015.1.0-18.el7ost.noarch openstack-nova-common-2015.1.0-18.el7ost.noarch openstack-nova-scheduler-2015.1.0-18.el7ost.noarch openstack-nova-conductor-2015.1.0-18.el7ost.noarch openstack-nova-api-2015.1.0-18.el7ost.noarch openstack-nova-cert-2015.1.0-18.el7ost.noarch openstack-nova-compute-2015.1.0-18.el7ost.noarch python-nova-2015.1.0-18.el7ost.noarch openstack-nova-novncproxy-2015.1.0-18.el7ost.noarch
(In reply to Lenny Verkhovsky from comment #0) > Description of problem: > > > Regarding code in > /usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py > > # Check host_device only when transport is used, since otherwise it > is > # directly derived from properties. Only needed for unit tests > # Loop until at least 1 path becomes available > while ((self._get_transport() != "default" and not host_devices) > or all(map(lambda x: not os.path.exists(x), host_devices))): > > > Since in Nova kilo host_device initialized to None, and stays None in case > /dev/disk/by-path wasn’t updated on the first try > (https://github.com/openstack/nova/blob/stable/kilo/nova/virt/libvirt/volume. > py#L842-L846), but in packstack code host_device modified to a list, then it > won’t pass the os.path.exists() for the None initialized host_device param > like in > (https://github.com/openstack/nova/blob/stable/kilo/nova/virt/libvirt/volume. > py#L446-L447) since os.path.exists crash for None values. > > One solution is to remove all the None values from the host_devices I might be missing something here but all I see in the logs is an incorrect rootwrap filter due to pci identifiers for the HBA being included in the paths : 21982 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher ProcessExecutionError: Unexpected error while running command. 21983 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher Command: sudo nova-rootwrap /etc/nova/rootwrap.conf ln --symbolic --force /dev/mapper/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 21984 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher Exit code: 99 21985 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher Stdout: u'' 21986 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher Stderr: u'/usr/bin/nova-rootwrap: Unauthorized command: ln --symbolic --force /dev/mapper/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 (no filter matched)\n' With the following filter set by default in OSP 7 for ln : /usr/share/nova/rootwrap/compute.filters:212:ln: RegExpFilter, ln, root, ln, --symbolic, --force, /dev/mapper/ip-.*-iscsi-iqn.*, /dev/disk/by-path/ip-.*-iscsi-iqn.* Changed upstream here : rootwrap: update ln --symbolic filter for FS and FC type volume drivers https://review.openstack.org/#/c/197713/ Lenny, does the following filter regex work for you instead of the default in /usr/share/nova/rootwrap/compute.filters or do you still see issues ? ln: RegExpFilter, ln, root, ln, --symbolic, --force, /dev/mapper/.*, .*
Created attachment 1084157 [details] logs Reproduce: testr run tempest.scenario.test_minimum_basic.TestMinimumBasicScenario Traceback 2015-10-18 15:42:21.204 29999 TRACE oslo_messaging.rpc.dispatcher or all(map(lambda x: not os.path.exists(x), host_devices))): 2015-10-18 15:42:21.204 29999 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py", line 476, in <lambda> 2015-10-18 15:42:21.204 29999 TRACE oslo_messaging.rpc.dispatcher or all(map(lambda x: not os.path.exists(x), host_devices))): 2015-10-18 15:42:21.204 29999 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib64/python2.7/genericpath.py", line 18, in exists 2015-10-18 15:42:21.204 29999 TRACE oslo_messaging.rpc.dispatcher os.stat(path) 20
(In reply to Lee Yarwood from comment #6) > (In reply to Lenny Verkhovsky from comment #0) > > Description of problem: > > > > > > Regarding code in > > /usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py > > > > # Check host_device only when transport is used, since otherwise it > > is > > # directly derived from properties. Only needed for unit tests > > # Loop until at least 1 path becomes available > > while ((self._get_transport() != "default" and not host_devices) > > or all(map(lambda x: not os.path.exists(x), host_devices))): > > > > > > Since in Nova kilo host_device initialized to None, and stays None in case > > /dev/disk/by-path wasn’t updated on the first try > > (https://github.com/openstack/nova/blob/stable/kilo/nova/virt/libvirt/volume. > > py#L842-L846), but in packstack code host_device modified to a list, then it > > won’t pass the os.path.exists() for the None initialized host_device param > > like in > > (https://github.com/openstack/nova/blob/stable/kilo/nova/virt/libvirt/volume. > > py#L446-L447) since os.path.exists crash for None values. > > > > One solution is to remove all the None values from the host_devices > > I might be missing something here but all I see in the logs is an incorrect > rootwrap filter due to pci identifiers for the HBA being included in the > paths : > > 21982 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher > ProcessExecutionError: Unexpected error while running command. > 21983 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher > Command: sudo nova-rootwrap /etc/nova/rootwrap.conf ln --symbolic --force > /dev/mapper/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack: > volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 > /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org. > openstack:volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 > 21984 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher Exit > code: 99 > 21985 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher > Stdout: u'' > 21986 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher > Stderr: u'/usr/bin/nova-rootwrap: Unauthorized command: ln --symbolic > --force > /dev/mapper/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack: > volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 > /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org. > openstack:volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 (no filter > matched)\n' > > With the following filter set by default in OSP 7 for ln : > > /usr/share/nova/rootwrap/compute.filters:212:ln: RegExpFilter, ln, root, ln, > --symbolic, --force, /dev/mapper/ip-.*-iscsi-iqn.*, > /dev/disk/by-path/ip-.*-iscsi-iqn.* > > Changed upstream here : > > rootwrap: update ln --symbolic filter for FS and FC type volume drivers > https://review.openstack.org/#/c/197713/ > > Lenny, does the following filter regex work for you instead of the default > in /usr/share/nova/rootwrap/compute.filters or do you still see issues ? > > ln: RegExpFilter, ln, root, ln, --symbolic, --force, /dev/mapper/.*, .* This is indeed another issue ( encrypt/decrypt ). We are checking this and will open another bug if needed. Those logs were also created with pacthed volume.py. I've attached new logs with reproduction of the issue in the next comment.
(In reply to Lenny Verkhovsky from comment #8) > (In reply to Lee Yarwood from comment #6) > > (In reply to Lenny Verkhovsky from comment #0) > > > Description of problem: > > > > > > > > > Regarding code in > > > /usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py > > > > > > # Check host_device only when transport is used, since otherwise it > > > is > > > # directly derived from properties. Only needed for unit tests > > > # Loop until at least 1 path becomes available > > > while ((self._get_transport() != "default" and not host_devices) > > > or all(map(lambda x: not os.path.exists(x), host_devices))): > > > > > > > > > Since in Nova kilo host_device initialized to None, and stays None in case > > > /dev/disk/by-path wasn’t updated on the first try > > > (https://github.com/openstack/nova/blob/stable/kilo/nova/virt/libvirt/volume. > > > py#L842-L846), but in packstack code host_device modified to a list, then it > > > won’t pass the os.path.exists() for the None initialized host_device param > > > like in > > > (https://github.com/openstack/nova/blob/stable/kilo/nova/virt/libvirt/volume. > > > py#L446-L447) since os.path.exists crash for None values. > > > > > > One solution is to remove all the None values from the host_devices > > > > I might be missing something here but all I see in the logs is an incorrect > > rootwrap filter due to pci identifiers for the HBA being included in the > > paths : > > > > 21982 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher > > ProcessExecutionError: Unexpected error while running command. > > 21983 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher > > Command: sudo nova-rootwrap /etc/nova/rootwrap.conf ln --symbolic --force > > /dev/mapper/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack: > > volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 > > /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org. > > openstack:volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 > > 21984 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher Exit > > code: 99 > > 21985 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher > > Stdout: u'' > > 21986 2015-10-08 10:58:39.597 3415 TRACE oslo_messaging.rpc.dispatcher > > Stderr: u'/usr/bin/nova-rootwrap: Unauthorized command: ln --symbolic > > --force > > /dev/mapper/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack: > > volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 > > /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org. > > openstack:volume-b21ab8b2-745c-4992-a29f-88993a9b591b-lun-0 (no filter > > matched)\n' > > > > With the following filter set by default in OSP 7 for ln : > > > > /usr/share/nova/rootwrap/compute.filters:212:ln: RegExpFilter, ln, root, ln, > > --symbolic, --force, /dev/mapper/ip-.*-iscsi-iqn.*, > > /dev/disk/by-path/ip-.*-iscsi-iqn.* > > > > Changed upstream here : > > > > rootwrap: update ln --symbolic filter for FS and FC type volume drivers > > https://review.openstack.org/#/c/197713/ > > > > Lenny, does the following filter regex work for you instead of the default > > in /usr/share/nova/rootwrap/compute.filters or do you still see issues ? > > > > ln: RegExpFilter, ln, root, ln, --symbolic, --force, /dev/mapper/.*, .* > > This is indeed another issue ( encrypt/decrypt ). We are checking this and > will open another bug if needed. Those logs were also created with pacthed > volume.py. > > I've attached new logs with reproduction of the issue in the next comment. https://review.openstack.org/#/c/197713 fixed the encryption issue tempest.scenario.test_encrypted_cinder_volumes.TestEncryptedCinderVolumes is passing now.
(In reply to Lenny Verkhovsky from comment #9) > https://review.openstack.org/#/c/197713 fixed the encryption issue > tempest.scenario.test_encrypted_cinder_volumes.TestEncryptedCinderVolumes > is passing now. Thanks, apologies for the delay, I've opened the following RHBZ to track this into RHEL OSP 7.0.z : Nova volume encryptors attach volume fails for NFS and FC (rootwrap) https://bugzilla.redhat.com/show_bug.cgi?id=1273466 (In reply to Lenny Verkhovsky from comment #8) > This is indeed another issue ( encrypt/decrypt ). We are checking this and > will open another bug if needed. Those logs were also created with pacthed > volume.py. > > I've attached new logs with reproduction of the issue in the next comment. Thanks that's better, as you've pointed out we have additionally backported the now abandoned Nova iscsi-multipath-enhancement [1] blueprint into RHEL OSP 7 [2]. As a result _get_host_devices can return a list containing a _single_ None value. IMHO there's no need to loop over all host_devices and we can just address this within the lambda function while testing if the path exists : # git diff diff --git a/nova/virt/libvirt/volume.py b/nova/virt/libvirt/volume.py index 3f3cc2b..7f668de 100644 --- a/nova/virt/libvirt/volume.py +++ b/nova/virt/libvirt/volume.py @@ -470,7 +470,8 @@ class LibvirtISCSIVolumeDriver(LibvirtBaseVolumeDriver): # directly derived from properties. Only needed for unit tests # Loop until at least 1 path becomes available while ((self._get_transport() != "default" and not host_devices) - or all(map(lambda x: not os.path.exists(x), host_devices))): + or all(map(lambda x: x is None or not os.path.exists(x), + host_devices))): if tries >= self.num_scan_tries: raise exception.NovaException(_("iSCSI device not found at %s") % (host_devices)) Can you test this and confirm that it works as expected for you? Happy to reconsider your original patch if you feel I've missed something here. Thanks in advance, Lee [1] https://review.openstack.org/#/c/134683/ [2] https://bugzilla.redhat.com/show_bug.cgi?id=1228295
Lee, the proposed patch is not enough. the while loop keeps scanning until it fails at if tries >= self.num_scan_tries: 2015-10-21 09:06:30.120 12553 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py", line 480, in connect_volume 2015-10-21 09:06:30.120 12553 TRACE oslo_messaging.rpc.dispatcher % (host_devices)) 2015-10-21 09:06:30.120 12553 TRACE oslo_messaging.rpc.dispatcher NovaException: iSCSI device not found at [None] Please, consider the following patch that fixes the root cause: --- a/nova/virt/libvirt/volume.py +++ b/nova/virt/libvirt/volume.py +++ /tmp/volume.py 2015-10-21 09:21:03.957542634 +0300 @@ -900,11 +900,11 @@ class LibvirtISCSIVolumeDriver(LibvirtBa if self._get_transport() == "default": return [("/dev/disk/by-path/%s" % device)] else: - host_device = None + host_device = [] look_for_device = glob.glob('/dev/disk/by-path/*%s' % device) if look_for_device: - host_device = look_for_device[0] - return [host_device] + host_device = [look_for_device[0]] + return host_device def _reconnect(self, iscsi_properties): # Note: iscsiadm does not support changing iface.iscsi_ifacename
(In reply to Lenny Verkhovsky from comment #11) > Lee, the proposed patch is not enough. > the while loop keeps scanning until it fails at > if tries >= self.num_scan_tries: > > 2015-10-21 09:06:30.120 12553 TRACE oslo_messaging.rpc.dispatcher File > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py", line 480, in > connect_volume > 2015-10-21 09:06:30.120 12553 TRACE oslo_messaging.rpc.dispatcher % > (host_devices)) > 2015-10-21 09:06:30.120 12553 TRACE oslo_messaging.rpc.dispatcher > NovaException: iSCSI device not found at [None] Thanks Lenny, can you provide the full log including the retries prior to this? Also the current iscsi_iface setting with `openstack-config --get /etc/nova/nova.conf libvirt iscsi_iface`. > Please, consider the following patch that fixes the root cause: > > --- a/nova/virt/libvirt/volume.py > +++ b/nova/virt/libvirt/volume.py > +++ /tmp/volume.py 2015-10-21 09:21:03.957542634 +0300 > @@ -900,11 +900,11 @@ class LibvirtISCSIVolumeDriver(LibvirtBa > if self._get_transport() == "default": > return [("/dev/disk/by-path/%s" % device)] > else: > - host_device = None > + host_device = [] > look_for_device = glob.glob('/dev/disk/by-path/*%s' % device) > if look_for_device: > - host_device = look_for_device[0] > - return [host_device] > + host_device = [look_for_device[0]] > + return host_device > > def _reconnect(self, iscsi_properties): > # Note: iscsiadm does not support changing iface.iscsi_ifacename While this corrects the issue of os.path.exists() being called with a None value I think you're missing the impact of _get_host_devices not returning None during connect_volumes when no device paths are present. The following will ensure that the retry loop is used correctly, without calling os.path.exists with None while still allowing host_devices to be refreshed at the end of each iteration : # git diff diff --git a/nova/virt/libvirt/volume.py b/nova/virt/libvirt/volume.py index 3f3cc2b..caefef4 100644 --- a/nova/virt/libvirt/volume.py +++ b/nova/virt/libvirt/volume.py @@ -469,7 +469,7 @@ class LibvirtISCSIVolumeDriver(LibvirtBaseVolumeDriver): # Check host_device only when transport is used, since otherwise it is # directly derived from properties. Only needed for unit tests # Loop until at least 1 path becomes available - while ((self._get_transport() != "default" and not host_devices) + while ((self._get_transport() != "default" and host_devices == [None]) or all(map(lambda x: not os.path.exists(x), host_devices))): if tries >= self.num_scan_tries: raise exception.NovaException(_("iSCSI device not found at %s") @@ -488,12 +488,13 @@ class LibvirtISCSIVolumeDriver(LibvirtBaseVolumeDriver): # For offloaded open-iscsi transports, host_device cannot be # guessed unlike iscsi_tcp where it can be obtained from # properties, so try and get it again. - if not host_devices and self._get_transport() != "default": + if host_devices == [None] and self._get_transport() != "default": host_devices = self._get_host_devices(iscsi_properties) tries = tries + 1 - if all(map(lambda x: not os.path.exists(x), host_devices)): + if (host_devices == [None] or + all(map(lambda x: not os.path.exists(x), host_devices))): time.sleep(tries ** 2) else: break Can you revert your changes and test the above in your environment? If this fails please provide a full DEBUG enabled log from nova-compute. Thanks in advance, Lee
Hi Lee, your latest patch was applied and verified as working. ( Do you still need logs for patch with retries error ? )
(In reply to Lenny Verkhovsky from comment #13) > Hi Lee, > your latest patch was applied and verified as working. > ( Do you still need logs for patch with retries error ? ) Please, just to review and ensure I've not missed anything here as I don't have an ISER environment to hand.
Created attachment 1085089 [details] logs Small update in patch ( see below ) log and dmesg log attached --- volume.py +++ volume.py @@ -470,7 +470,8 @@ class LibvirtISCSIVolumeDriver(LibvirtBa # directly derived from properties. Only needed for unit tests # Loop until at least 1 path becomes available while ((self._get_transport() != "default" and not host_devices) - or all(map(lambda x: not os.path.exists(x), host_devices))): + or all(map(lambda x: x is None or not os.path.exists(x), + host_devices))): if tries >= self.num_scan_tries: raise exception.NovaException(_("iSCSI device not found at %s") % (host_devices)) @@ -493,7 +494,8 @@ class LibvirtISCSIVolumeDriver(LibvirtBa tries = tries + 1 - if all(map(lambda x: not os.path.exists(x), host_devices)): + if all(map(lambda x: x is None or not os.path.exists(x), + host_devices)): time.sleep(tries ** 2) else: break
(In reply to Lenny Verkhovsky from comment #15) > Created attachment 1085089 [details] > logs > > Small update in patch ( see below ) > log and dmesg log attached I don't see a successful attach in the logs, are these the correct logs? 333 2015-10-21 13:53:19.358 30448 TRACE oslo_messaging.rpc.dispatcher NovaException: iSCSI device not found at [None] Can you also provide output for the following commands : # openstack-config --get /etc/nova/nova.conf libvirt iscsi_iface # ll /dev/disk/by-path/*ip-*-iscsi-*-lun-*
I believe that logs are correct. I've re-run it with the same result. # openstack-config --get /etc/nova/nova.conf libvirt iscsi_iface Parameter not found: iscsi_iface [root@r-ufm173 openstack-tempest-kilo]# ll /dev/disk/by-path/*ip-*-iscsi-*-lun-* lrwxrwxrwx. 1 root root 9 Oct 18 16:30 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-23b19a42-d9ee-4d32-bea8-6aea8b75faf6-lun-0 -> ../../sdb lrwxrwxrwx. 1 root root 9 Oct 20 21:48 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-2774e7f0-b163-4a11-b703-c85ac6a3dbe0-lun-0 -> ../../sdl lrwxrwxrwx. 1 root root 9 Oct 21 09:05 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-2dd18dfd-f805-45fd-ab33-6cc9b4b9a728-lun-0 -> ../../sdn lrwxrwxrwx. 1 root root 9 Oct 20 17:24 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-4349d98a-7119-4014-9cb3-6cc9fee06517-lun-0 -> ../../sdf lrwxrwxrwx. 1 root root 9 Oct 20 17:55 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-4bb04f8e-6cbc-4695-86b3-93e32e7fac9f-lun-0 -> ../../sdi lrwxrwxrwx. 1 root root 9 Oct 20 17:18 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-53b7d22a-913c-45a1-a19a-c77a39580d50-lun-0 -> ../../sde lrwxrwxrwx. 1 root root 9 Oct 20 17:48 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-8aedc697-126a-4adb-a674-ed9719acdf1f-lun-0 -> ../../sdh lrwxrwxrwx. 1 root root 9 Oct 20 21:23 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-9096d9d3-1fd3-4f1b-8c93-1aa2c8268f07-lun-0 -> ../../sdj lrwxrwxrwx. 1 root root 9 Oct 20 17:36 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-9dc5477b-4ae2-4e88-b8c4-f69da6b647bd-lun-0 -> ../../sdg lrwxrwxrwx. 1 root root 9 Oct 21 13:52 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-b766c2e3-3a43-4ca9-a674-5bfa87eda1e0-lun-0 -> ../../sdo lrwxrwxrwx. 1 root root 9 Oct 21 08:53 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-c4329d31-23d2-45f4-aee3-3fa220cc9f85-lun-0 -> ../../sdm lrwxrwxrwx. 1 root root 9 Oct 20 21:30 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-d94eb716-84f1-4142-8bf7-97ec2fde95f4-lun-0 -> ../../sdk
(In reply to Lenny Verkhovsky from comment #17) > I believe that logs are correct. I've re-run it with the same result. The following snippet from the logs clearly shows a failure to attach the b766c2e3 volume to the 1d1485ff instance. Was this volume and instance part of your test run that succeeded? 223 2015-10-21 13:52:12.670 30448 INFO nova.compute.manager [req-0e6ae146-2388-4521-9d86-a425dfbf3b70 3ca462dfcecd4c77b0745169f1088755 54f794084de847658a57b2033b4320d5 - - -] [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] Attaching volume b766c2e3-3a43-4ca9-a674-5bfa87eda1e0 to /dev/vdb 224 2015-10-21 13:52:22.781 30448 WARNING nova.virt.libvirt.volume [req-0e6ae146-2388-4521-9d86-a425dfbf3b70 3ca462dfcecd4c77b0745169f1088755 54f794084de847658a57b2033b4320d5 - - -] ISCSI volume not yet found at: vdb. Will rescan & retry. Try number: 0 225 2015-10-21 13:52:23.904 30448 WARNING nova.virt.libvirt.volume [req-0e6ae146-2388-4521-9d86-a425dfbf3b70 3ca462dfcecd4c77b0745169f1088755 54f794084de847658a57b2033b4320d5 - - -] ISCSI volume not yet found at: vdb. Will rescan & retry. Try number: 1 226 2015-10-21 13:52:28.029 30448 WARNING nova.virt.libvirt.volume [req-0e6ae146-2388-4521-9d86-a425dfbf3b70 3ca462dfcecd4c77b0745169f1088755 54f794084de847658a57b2033b4320d5 - - -] ISCSI volume not yet found at: vdb. Will rescan & retry. Try number: 2 227 2015-10-21 13:52:37.153 30448 WARNING nova.virt.libvirt.volume [req-0e6ae146-2388-4521-9d86-a425dfbf3b70 3ca462dfcecd4c77b0745169f1088755 54f794084de847658a57b2033b4320d5 - - -] ISCSI volume not yet found at: vdb. Will rescan & retry. Try number: 3 [..] 233 2015-10-21 13:52:53.286 30448 WARNING nova.virt.libvirt.volume [req-0e6ae146-2388-4521-9d86-a425dfbf3b70 3ca462dfcecd4c77b0745169f1088755 54f794084de847658a57b2033b4320d5 - - -] ISCSI volume not yet found at: vdb. Will rescan & retry. Try number: 4 234 2015-10-21 13:53:18.410 30448 ERROR nova.virt.block_device [req-0e6ae146-2388-4521-9d86-a425dfbf3b70 3ca462dfcecd4c77b0745169f1088755 54f794084de847658a57b2033b4320d5 - - -] [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] Driver failed to attach volume b766c2e3-3a43-4ca9-a674-5bfa87eda1e0 at /dev/vdb 235 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] Traceback (most recent call last): 236 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 253, in attach 237 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] device_type=self['device_type'], encryption=encryption) 238 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1070, in attach_volume 239 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] self._connect_volume(connection_info, disk_info) 240 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1021, in _connect_volume 241 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] driver.connect_volume(connection_info, disk_info) 242 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 445, in inner 243 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] return f(*args, **kwargs) 244 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py", line 477, in connect_volume 245 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] % (host_devices)) 246 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] NovaException: iSCSI device not found at [None] 247 2015-10-21 13:53:18.410 30448 TRACE nova.virt.block_device [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] 248 2015-10-21 13:53:19.127 30448 ERROR nova.compute.manager [req-0e6ae146-2388-4521-9d86-a425dfbf3b70 3ca462dfcecd4c77b0745169f1088755 54f794084de847658a57b2033b4320d5 - - -] [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] Failed to attach b766c2e3-3a43-4ca9-a674-5bfa87eda1e0 at /dev/vdb 249 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] Traceback (most recent call last): 250 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4798, in _attach_volume 251 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] do_check_attach=False, do_driver_attach=True) 252 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 49, in wrapped 253 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] ret_val = method(obj, context, *args, **kwargs) 254 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 262, in attach 255 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] connector) 256 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__ 257 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] six.reraise(self.type_, self.value, self.tb) 258 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 253, in attach 259 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] device_type=self['device_type'], encryption=encryption) 260 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1070, in attach_volume 261 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] self._connect_volume(connection_info, disk_info) 262 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1021, in _connect_volume 263 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] driver.connect_volume(connection_info, disk_info) 264 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 445, in inner 265 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] return f(*args, **kwargs) 266 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume.py", line 477, in connect_volume 267 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] % (host_devices)) 268 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] NovaException: iSCSI device not found at [None] 269 2015-10-21 13:53:19.127 30448 TRACE nova.compute.manager [instance: 1d1485ff-030f-45e6-a0eb-330ddea4f41d] We can see this volume listed under /dev/disk/by-path/ below : lrwxrwxrwx. 1 root root 9 Oct 21 13:52 /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-b766c2e3-3a43-4ca9-a674-5bfa87eda1e0-lun-0 -> ../../sdo So either the path used during _get_host_devices to find this device is wrong _or_ we are not giving enough time for this device to come online in your environment. Can you add iscsi_properties and host_devices to the output of the LOG.warn() line within the retry loop? This should provide some more context here once reproduced : # git diff diff --git a/nova/virt/libvirt/volume.py b/nova/virt/libvirt/volume.py index 3f3cc2b..97fe542 100644 --- a/nova/virt/libvirt/volume.py +++ b/nova/virt/libvirt/volume.py @@ -469,15 +469,18 @@ class LibvirtISCSIVolumeDriver(LibvirtBaseVolumeDriver): # Check host_device only when transport is used, since otherwise it is # directly derived from properties. Only needed for unit tests # Loop until at least 1 path becomes available - while ((self._get_transport() != "default" and not host_devices) + while ((self._get_transport() != "default" and host_devices == [None]) or all(map(lambda x: not os.path.exists(x), host_devices))): if tries >= self.num_scan_tries: raise exception.NovaException(_("iSCSI device not found at %s") % (host_devices)) - LOG.warn(_LW("ISCSI volume not yet found at: %(disk_dev)s. " + LOG.warn(_LW("ISCSI volume not yet found for: %(disk_dev)s with" + "iscsi_properties: %(iscsi_props), " + "host_devices returned: %(host_devices)." "Will rescan & retry. Try number: %(tries)s"), - {'disk_dev': disk_dev, 'tries': tries}) + {'disk_dev': disk_dev, 'iscsi_props':iscsi_properties, + 'host_devices': host_devices, 'tries': tries}) # The rescan isn't documented as being necessary(?), but it helps if self.use_multipath: @@ -488,12 +491,13 @@ class LibvirtISCSIVolumeDriver(LibvirtBaseVolumeDriver): # For offloaded open-iscsi transports, host_device cannot be # guessed unlike iscsi_tcp where it can be obtained from # properties, so try and get it again. - if not host_devices and self._get_transport() != "default": + if host_devices == [None] and self._get_transport() != "default": host_devices = self._get_host_devices(iscsi_properties) tries = tries + 1 - if all(map(lambda x: not os.path.exists(x), host_devices)): + if (host_devices == [None] or + all(map(lambda x: not os.path.exists(x), host_devices))): time.sleep(tries ** 2) else: break
Created attachment 1085414 [details] logs I've attached logs. 1. the following dev was attached and left after test /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org.openstack:volume-03a54fcb-d13b-44fc-a78e-b0c5353b2f99-lun-0 -> ../../sdq 2. I've used my patch from c#15 with log message from c#18
(In reply to Lenny Verkhovsky from comment #19) > Created attachment 1085414 [details] > logs > > I've attached logs. You have missed nova-compute.log for this run, can you please provide it? > 1. the following dev was attached and left after test > /dev/disk/by-path/pci-0000:02:00.0-ip-1.1.1.1:3260-iscsi-iqn.2010-10.org. > openstack:volume-03a54fcb-d13b-44fc-a78e-b0c5353b2f99-lun-0 -> ../../sdq In the tempest logs we can see volume 03a54fcb fail to reach an in-use (attached) status : tmp/20151022_tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.log 3453 Traceback (most recent call last): 3454 File "/usr/share/openstack-tempest-kilo/tempest/test.py", line 126, in wrapper 3455 return f(self, *func_args, **func_kwargs) 3456 File "/usr/share/openstack-tempest-kilo/tempest/scenario/test_minimum_basic.py", line 134, in test_minimum_basic_scenario 3457 self.nova_volume_attach() 3458 File "/usr/share/openstack-tempest-kilo/tempest/scenario/test_minimum_basic.py", line 84, in nova_volume_attach 3459 self.volumes_client.wait_for_volume_status(volume['id'], 'in-use') 3460 File "/usr/share/openstack-tempest-kilo/tempest/services/volume/json/volumes_client.py", line 182, in wait_for_volume_status 3461 raise exceptions.TimeoutException(message) 3462 TimeoutException: Request timed out 3463 Details: Volume 03a54fcb-d13b-44fc-a78e-b0c5353b2f99 failed to reach in-use status (current: available) within the required time (196 s). 3464 Ran 1 tests in 237.729s (+169.925s) > 2. I've used my patch from c#15 with log message from c#18 The patch in c#15 is not required here. Can you please attach nova-compute logs from this run for review with the additional logging provided by the patch in c#18?
Thanks again for your time on the phone Lenny.Removing the NEEDINFO against you. Just to confirm that the patch in c#18 does correct this for you. The logs provided in c#19 are using your patch from c#15. Apologies again for the confusion. Unfortunately this fix is going to be a downstream-only given the use of the now abandoned Nova iscsi-multipath-enhancement blueprint in RHEL OSP 7. I'll propose this internally now and aim to get you a test build later today.
*** Bug 1272884 has been marked as a duplicate of this bug. ***
The fix is in the code. the ticket has been verified by third party. (see comment 25)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2015:2673