Description of problem:
Faulty multipath devices left on compute nodes after deleting instances which have cinder attached

Version-Release number of selected component (if applicable):
OSP 6.0
EMC iSCSI storage

How reproducible:
100%

Steps to Reproduce:
1. Boot an instance from volume
2. Attach a cinder volume
3. Delete the instance

Actual results:
Faulty devices are left on the compute node.

Expected results:
No faulty devices should be left on the compute node.

Additional info:
Logs have been uploaded to collab-shell. For details please check comment #1
So the issue here is that when there are no additional LUNs provided by the IQN, Nova simply disconnects from the target portal. This removes the path devices but keeps the multipath device in place. When there are additional LUNs provided by the IQN, Nova deletes the paths _and_ the multipath device.

nova/virt/libvirt/volume.py

    class LibvirtISCSIVolumeDriver(LibvirtBaseVolumeDriver):
        """Driver to attach Network volumes to libvirt."""
    [..]
        @utils.synchronized('connect_volume')
        def disconnect_volume(self, connection_info, disk_dev):
            """Detach the volume from instance_name."""
    [..]
            if self.use_multipath and multipath_device:
                return self._disconnect_volume_multipath_iscsi(iscsi_properties,
                                                               multipath_device)
    [..]
        def _disconnect_volume_multipath_iscsi(self, iscsi_properties,
                                               multipath_device):
    [..]
            # Get a target for all other multipath devices
            other_iqns = [self._get_multipath_iqn(device)
                          for device in devices]
            # Get all the targets for the current multipath device
            current_iqns = [iqn for ip, iqn in ips_iqns]

            in_use = False
            for current in current_iqns:
                if current in other_iqns:
                    in_use = True
                    break

            # If no other multipath device attached has the same iqn
            # as the current device
            if not in_use:
                # disconnect if no other multipath devices with same iqn
                self._disconnect_mpath(iscsi_properties, ips_iqns)
                return
            elif multipath_device not in devices:
                # delete the devices associated w/ the unused multipath
                self._delete_mpath(iscsi_properties, multipath_device, ips_iqns)

            # else do not disconnect iscsi portals,
            # as they are used for other luns,
            # just remove multipath mapping device descriptor
            self._remove_multipath_device_descriptor(multipath_device)
            return

In Liberty, os-brick removes the paths and mpath device first before deciding if we need to disconnect from the portal.
This behaviour was present in Cinder before the fork into os-brick so I can try to port this across into Nova prior to our use of os-brick.
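To make the difference between the two orderings concrete, here is a rough toy model, not actual Nova or os-brick code. `FakeHost`, `disconnect_portal_first` and `disconnect_devices_first` are hypothetical names invented for illustration, and the model assumes that logging out of an IQN drops its path devices while leaving any multipath map behind with no paths (i.e. faulty).

```python
class FakeHost:
    """Toy stand-in for a compute node's iSCSI/multipath state (hypothetical)."""

    def __init__(self):
        self.sessions = set()  # IQNs we are logged in to
        self.mpaths = {}       # mpath name -> {"iqn": str, "paths": set of str}

    def attach(self, mpath, iqn, paths):
        self.sessions.add(iqn)
        self.mpaths[mpath] = {"iqn": iqn, "paths": set(paths)}

    def logout(self, iqn):
        # Assumption: logout removes path devices but not the multipath map.
        self.sessions.discard(iqn)
        for dev in self.mpaths.values():
            if dev["iqn"] == iqn:
                dev["paths"].clear()

    def remove_mpath(self, mpath):
        # Removes the paths and the multipath map together.
        self.mpaths.pop(mpath, None)

    def iqn_used_by_others(self, mpath):
        iqn = self.mpaths[mpath]["iqn"]
        return any(d["iqn"] == iqn
                   for name, d in self.mpaths.items() if name != mpath)

    def faulty(self):
        # A map with no remaining paths is what shows up as "faulty".
        return sorted(n for n, d in self.mpaths.items() if not d["paths"])


def disconnect_portal_first(host, mpath):
    """Ordering described for Nova above: log out early if the IQN is unused."""
    iqn = host.mpaths[mpath]["iqn"]
    if not host.iqn_used_by_others(mpath):
        host.logout(iqn)      # paths go away, the map stays behind
        return
    host.remove_mpath(mpath)  # other LUNs share the IQN: remove devices only


def disconnect_devices_first(host, mpath):
    """Ordering described for os-brick in Liberty: devices first, then portal."""
    iqn = host.mpaths[mpath]["iqn"]
    in_use = host.iqn_used_by_others(mpath)
    host.remove_mpath(mpath)
    if not in_use:
        host.logout(iqn)
```

In this model a single volume detached with `disconnect_portal_first` leaves its map listed by `faulty()`, while `disconnect_devices_first` leaves nothing behind. When two maps share one IQN, both functions behave the same way.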
This will make volume attach/detach perform much better, but it cannot resolve the faulty device issue. As you mentioned, os-brick removes the paths and mpath device first and then disconnects from the portal. A faulty device will still exist if another volume attach/detach happens between the deletion of the paths/mpath devices and the disconnection from the portal, because a volume attach/detach triggers a rescan, which generates the paths and mpath devices again.
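The interleaving described here can be sketched with another toy model. This is illustrative only: `RaceHost` and `detach` are hypothetical names, and the model assumes that the array still exports the LUN at the moment the rescan runs and that logout leaves an existing multipath map behind without paths.

```python
class RaceHost:
    """Toy model of one iSCSI session on a compute node (hypothetical)."""

    def __init__(self, luns):
        self.logged_in = True
        self.exported_luns = set(luns)  # assumption: array still presents these
        self.mpaths = {lun: self._new_paths(lun) for lun in luns}

    @staticmethod
    def _new_paths(lun):
        return {"path0-" + lun, "path1-" + lun}

    def remove_devices(self, lun):
        # Step 1 of the detach: drop the paths and the multipath map.
        self.mpaths.pop(lun, None)

    def rescan(self):
        # Triggered by some other attach/detach: every LUN still visible
        # over the live session gets its paths and map created again.
        if self.logged_in:
            for lun in self.exported_luns:
                self.mpaths.setdefault(lun, self._new_paths(lun))

    def logout(self):
        # Step 2 of the detach: paths vanish, any existing map stays.
        self.logged_in = False
        for paths in self.mpaths.values():
            paths.clear()

    def faulty(self):
        return sorted(l for l, p in self.mpaths.items() if not p)


def detach(host, lun, rescan_in_between=False):
    """Devices-first detach, optionally interrupted by a concurrent rescan."""
    host.remove_devices(lun)
    if rescan_in_between:
        host.rescan()  # the window described in the comment above
    host.logout()
    return host.faulty()
```

With no rescan in the window, `detach` returns an empty list. With `rescan_in_between=True`, the map for the detached LUN is regenerated before the logout and ends up reported as faulty.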