Red Hat Bugzilla – Bug 1281909
Errors when resizing devices after disconnecting storage server during maintenance flow
Last modified: 2016-01-19 10:37:24 EST
Created attachment 1093791 [details]
Vdsm log showing the errors
Description of problem:
When moving a host to maintenance, vdsm disconnects from all storage servers.
As part of the disconnect, vdsm performs a storage refresh operation.
This includes resizing all multipath devices.
There seems to be a race between removing the iSCSI session and the removal
of the sysfs devices, so when we enumerate devices after disconnecting
from the storage server, we get various errors:
Multipath device with no slaves:
2015-11-13 20:20:55,252 WARNING [Storage.Multipath] (jsonrpc.Executor/3) Map '3600140587a1af8ecb9e4fa9ad76f9b28' has no slaves [multipath:107(_resize_if_needed)]
This device will probably disappear soon.
Devices with a missing /sys/block/sdi/queue/logical_block_size:
2015-11-13 20:20:55,253 ERROR [Storage.Multipath] (jsonrpc.Executor/3) Could not resize device 360014052f7915069dd94a6eaf25b4edf [multipath:98(resize_devices)]
Traceback (most recent call last):
File "/usr/share/vdsm/storage/multipath.py", line 96, in resize_devices
File "/usr/share/vdsm/storage/multipath.py", line 104, in _resize_if_needed
for slave in devicemapper.getSlaves(name)]
File "/usr/share/vdsm/storage/multipath.py", line 161, in getDeviceSize
bs, phyBs = getDeviceBlockSizes(devName)
File "/usr/share/vdsm/storage/multipath.py", line 153, in getDeviceBlockSizes
IOError: [Errno 2] No such file or directory: '/sys/block/sdi/queue/logical_block_size'
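The traceback above boils down to reading a sysfs attribute for a device that has already been torn down. A minimal sketch of the failure mode and a tolerant variant (this is not vdsm's actual code; `get_block_sizes` and the `sys_block` parameter are hypothetical stand-ins for illustration):

```python
import errno
import os

SYS_BLOCK = "/sys/block"  # devices vanish from here when the iSCSI session is removed


def get_block_sizes(dev_name, sys_block=SYS_BLOCK):
    """Read logical and physical block sizes for a device from sysfs.

    Returns (logical, physical), or None if the device disappeared
    between enumeration and the read (ENOENT), as happens when the
    iSCSI session is being removed concurrently.
    """
    queue = os.path.join(sys_block, dev_name, "queue")
    try:
        with open(os.path.join(queue, "logical_block_size")) as f:
            logical = int(f.read())
        with open(os.path.join(queue, "physical_block_size")) as f:
            physical = int(f.read())
    except EnvironmentError as e:
        if e.errno == errno.ENOENT:
            return None  # device was removed under our feet
        raise
    return logical, physical
```

Without the ENOENT check, the open() call raises exactly the IOError seen in the log whenever enumeration races with session teardown.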
Version-Release number of selected component (if applicable):
Since multipath device resizing is supported
Steps to Reproduce:
1. Get setup with ISCSI storage domain
2. Put host to maintenance
Actual results:
Warnings and errors during the disconnect storage server flow
There are two issues:
1. Resizing devices is not needed after disconnect.
This operation is needed only when:
1. connecting to storage
2. getting the device list
3. performing VG operations
Currently this operation is part of storage refresh, which is needed
when disconnecting from a storage server.
2. We don't wait until the iSCSI session is removed, so we race with the
SCSI subsystem when enumerating devices.
Solving the first issue will probably eliminate the second.
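Until resizing is removed from the disconnect path, the resize loop can also be made tolerant of maps being torn down mid-scan: skip maps that already have no slaves, and treat a failed resize as a per-device warning rather than a flow failure. A hedged sketch, not vdsm's actual implementation (`resize_devices`, `get_slaves`, and `resize_map` here are hypothetical stand-ins for the real internals):

```python
import logging

log = logging.getLogger("Storage.Multipath")


def resize_devices(maps, get_slaves, resize_map):
    """Resize multipath maps, tolerating devices removed mid-scan.

    maps:        iterable of multipath map names
    get_slaves:  callable returning the slave device names for a map
    resize_map:  callable performing the actual resize
    """
    for name in maps:
        slaves = get_slaves(name)
        if not slaves:
            # The map is being removed (e.g. the iSCSI session is going
            # away); skip it instead of failing the refresh.
            log.warning("Map %r has no slaves, skipping", name)
            continue
        try:
            resize_map(name, slaves)
        except EnvironmentError:
            # The device may have disappeared between the slave lookup
            # and the resize; log and continue with the remaining maps.
            log.exception("Could not resize device %s", name)
```

This mirrors the two log lines in the description: the "has no slaves" warning becomes an explicit skip, and the IOError becomes a logged, non-fatal event.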
This is a regression caused by adding support for device resizing, but it
does not affect the functionality of the system.
Please set target release or I can't move the bug to ON_QA automatically.
Bug tickets that are moved to testing must have a target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.
Right now, vdsm doesn't disconnect its iSCSI sessions upon storage domain deactivation due to BZ #1279485.
Fred/Nir, should we wait for the fix of BZ #1279485 in order to test the scenario described here or will it be OK to test it with manual intervention in iSCSI sessions disconnection as a workaround?
Elad, this issue appeared only after the fix in BZ #1279485.
I don't think you can test it without it.
No IOError while putting host to maintenance on a DC with active iSCSI domains.
Host moves to maintenance and iSCSI sessions get disconnected successfully.