Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1599233

Summary: Device removal seams to be racy with cinder backup
Product: Red Hat OpenStack Reporter: Attila Fazekas <afazekas>
Component: python-os-brickAssignee: Cinder Bugs List <cinder-bugs>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Shai Revivo <srevivo>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 10.0 (Newton)CC: afazekas, apevec, cschwede, eharney, jschluet, lhh, tshefi
Target Milestone: ---Flags: tshefi: automate_bug-
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-08-01 13:25:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Attila Fazekas 2018-07-09 09:25:45 UTC
Description of problem:
Parallel test run leaded to failed device removal.
According to message log possible it was removed, but another one with same non unique[sda] name was created. 

- consider using full device path
- consider checking is the remaining device(s) really the device(s) you think
- consider listening on device removal events or (file-system events) instead of periodic sampling


Version-Release number of selected component (if applicable):
python-os-brick.noarch              1.6.2-3.el7ost      @rhelosp-10.0-puddle    
python-cinder.noarch                1:9.1.4-33.el7ost   @rhelosp-10.0-puddle    


How reproducible:
Unknown, less than 25%

Steps to Reproduce:
1. packstack setup
2. tempest on 4 thread
3.

Actual results:
Random test case failure

Expected results:
Stable tempest run

Additional info:
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 30:0:0:0: alua: Detached
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 30:0:0:0: [sda] Synchronizing SCSI cache

Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: scsi host31: iSCSI Initiator over TCP/IP
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: scsi 31:0:0:0: Direct-Access     LIO-ORG  IBLOCK           4.0  PQ: 0 ANSI: 5
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: scsi 31:0:0:0: alua: supports implicit and explicit TPGS
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: scsi 31:0:0:0: alua: device naa.6001405efac45abae7440799bd809719 port group 0 rel port 1
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: scsi 31:0:0:0: alua: Attached
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 31:0:0:0: [sda] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 31:0:0:0: Attached scsi generic sg0 type 0
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 31:0:0:0: alua: transition timeout set to 60 seconds
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 31:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 31:0:0:0: [sda] Write Protect is off
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 31:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: iSCSI/iqn.1994-05.com.redhat:31589f129b8: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
Jun 29 02:21:01 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 kernel: sd 31:0:0:0: [sda] Attached SCSI disk


Jun 29 02:21:07 d75531bf77f0c7d7d5b02fd78ec0e987-aio-0 cinder-backup: 2018-06-29 02:21:07.382 27795 WARNING os_brick.exception [req-bd1aa7c1-0b3d-4a2c-8d9b-b918ad63c2e7 e32cb7ddecff409fbbca9f446219a090 e1820956e5ac41e68a62968e19ffcd02 - default default] Some devices remain from set([u'sda'])

2018-06-29 02:21:07.417 27795 WARNING os_brick.initiator.connectors.iscsi [req-bd1aa7c1-0b3d-4a2c-8d9b-b918ad63c2e7 e32cb7ddecff409fbbca9f446219a090 e1820956e5ac41e68a62968e19ffcd02 - default default] There were errors removing set([u'sda']), leftovers may remain in the system
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server [req-bd1aa7c1-0b3d-4a2c-8d9b-b918ad63c2e7 e32cb7ddecff409fbbca9f446219a090 e1820956e5ac41e68a62968e19ffcd02 - default default] Exception during message handling
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/cinder/backup/manager.py", line 443, in restore_backup
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     backup.save()
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     self.force_reraise()
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/cinder/backup/manager.py", line 437, in restore_backup
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     self._run_restore(context, backup, volume)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/cinder/backup/manager.py", line 483, in _run_restore
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     force=True)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/cinder/backup/manager.py", line 895, in _detach_device
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     ignore_errors=ignore_errors)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/os_brick/utils.py", line 137, in trace_logging_wrapper
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 830, in disconnect_volume
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     ignore_errors=ignore_errors)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 887, in _cleanup_connection
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     raise exc
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server ExceptionChainer: 
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server Chained Exception #1
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   Traceback (most recent call last):
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     File "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line 231, in remove_connection
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server       self.wait_for_volumes_removal(devices_names)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     File "/usr/lib/python2.7/site-packages/os_brick/utils.py", line 61, in _wrapper
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server       return r.call(f, *args, **kwargs)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     File "/usr/lib/python2.7/site-packages/retrying.py", line 229, in call
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server       raise attempt.get()
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server       six.reraise(self.value[0], self.value[1], self.value[2])
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server       attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server     File "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line 90, in wait_for_volumes_removal
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server       raise exception.VolumePathNotRemoved(volume_path=exist)
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   VolumePathNotRemoved: Volume path [u'sda'] was not removed in time.
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server   
2018-06-29 02:21:07.467 27795 ERROR oslo_messaging.rpc.server

Comment 8 Tzach Shefi 2019-04-29 12:41:51 UTC
Closed, nothing to test automate per close loop.