Description of problem:
Setting an SD to maintenance fails and turns the SD to inactive mode. Reproduced on iSCSI and FCP; does not reproduce on NFS and Gluster. After the SD turns to inactive mode, it is possible to activate it.

Version-Release number of selected component (if applicable):
ovirt-engine-4.5.0-582.gd548206.185.el8ev.noarch
vdsm-4.50.0.5-1.el8ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. In the web admin UI, go to "Compute" -> "Data Centers".
2. Choose an iSCSI or FCP SD.
3. Set it to maintenance mode.

Actual results:
After a few seconds you will get a message like this:

"VDSM host_mixed_1 command DisconnectStorageServerVDS failed: Error storage server disconnection: ('domType=3, spUUID=a370dd63-8234-47db-a373-3040ea63f4e1, conList=[]',)
VDSM host_mixed_3 command DisconnectStorageServerVDS failed: Error storage server disconnection: ('domType=3, spUUID=a370dd63-8234-47db-a373-3040ea63f4e1, conList=[]',)
VDSM host_mixed_2 command DisconnectStorageServerVDS failed: Error storage server disconnection: ('domType=3, spUUID=a370dd63-8234-47db-a373-3040ea63f4e1, conList=[]',)
Failed to deactivate Storage Domain iscsi_0 (Data Center golden_env_mixed)."
And on the SPM host we get a traceback:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/dispatcher.py", line 74, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 110, in wrapper
    return m(self, *a, **kw)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 1190, in prepare
    raise self.error
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 884, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-119>", line 2, in disconnectStorageServer
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2232, in disconnectStorageServer
    results = storageServer.disconnect(domType, conList)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 932, in disconnect
    con_class, connections = _prepare_connections(dom_type, con_defs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 942, in _prepare_connections
    con_class = ConnectionFactory.registeredConnectionTypes[con_info.type]
UnboundLocalError: local variable 'con_info' referenced before assignment

Expected results:
Setting the SD to maintenance should succeed without any failures.

Additional info:
This is a regression that does not reproduce in the latest 4.4.10 build. It seems the fix from this bug influenced this action: https://bugzilla.redhat.com/1787192
This is caused by the engine not sending connection info to vdsm:

2022-02-15 15:48:25,654+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (EE-ManagedThreadFactory-engine-Thread-37827) [storagedomains_syncAction_bb8aad06-f] START, DisconnectStorageServerVDSCommand(HostName = host_mixed_3, StorageServerConnectionManagementVDSParameters:{hostId='1d0c3077-9101-4685-bfbb-4efadc7c8899', storagePoolId='a370dd63-8234-47db-a373-3040ea63f4e1', storageType='ISCSI', connectionList='[]', sendNetworkEventOnFailure='true'}), log id: 5499f753

connectionList is an empty list, which caused vdsm to crash. That said, the vdsm code can be improved to handle this situation more gracefully.
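The crash is the classic Python pitfall of referencing a for-loop variable after the loop when the iterable was empty. The following is a minimal, hypothetical stand-in for vdsm's `_prepare_connections` (the real code is more involved; names here are illustrative only) that reproduces the same `UnboundLocalError`:

```python
def prepare_connections(con_defs):
    """Simplified stand-in for vdsm's _prepare_connections()."""
    connections = []
    for con_info in con_defs:  # loop body never runs when con_defs == []
        connections.append(con_info)
    # con_info is referenced after the loop; with an empty con_defs it was
    # never bound, so Python raises UnboundLocalError on the next line.
    return type(con_info).__name__, connections

try:
    prepare_connections([])
except UnboundLocalError as e:
    print(f"reproduced: {e}")
```

With a non-empty list the function works; with the empty `conList=[]` the engine sent, it crashes exactly as in the traceback above.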
A workaround exists ("After the SD turns to inactive mode, it is possible to activate it"), so setting severity to high.
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
This happens only when multiple SDs are connected to the same iSCSI target; in that case we cannot disconnect the target. In other words, this is expected engine behaviour, and it should be fixed in vdsm to do nothing when the connection list is empty.
We need an engine bug: it should never send a disconnect request without connections to disconnect; this is an invalid request. Unfortunately we must support broken engines, so we cannot fail the request. The vdsm schema should be updated to require a non-empty connection list for both connectStorageServer and disconnectStorageServer.
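A defensive fix along the lines suggested above could look like this. This is only a sketch under the assumptions stated in the comments; the function signature and the per-connection result shape are illustrative, not vdsm's actual API:

```python
def disconnect(dom_type, con_defs):
    """Sketch of a disconnect path that tolerates an empty connection list."""
    if not con_defs:
        # An empty connection list is an invalid request from the engine,
        # but broken engines must be supported, so treat it as a no-op
        # instead of crashing on an unbound loop variable.
        return []
    # Normal path (illustrative): report per-connection status, 0 == success.
    return [{"id": con["id"], "status": 0} for con in con_defs]

print(disconnect(3, []))  # → []
```

The early return keeps the request from failing while preserving the normal behaviour for valid, non-empty connection lists.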
(In reply to Michal Skrivanek from comment #3)
> workaround exists: "After the SD turns to inactive mode, it is possible to
> activate it." -> high severity.

Yes, there is a workaround for activation, but the fact remains that the SD cannot be moved to maintenance, so customers cannot detach the SD and move it to a different DC or engine, which is a huge blocker IMO -> changing to urgent.
Verified successfully.

Versions:
ovirt-engine-4.5.0-743c0a787472.211.el8ev.noarch
vdsm-4.50.0.7-1.el8ev

Verified steps:
1) Set SDs (FCP, iSCSI, NFS, Gluster) to maintenance on an HE environment and on a regular one.
2) Activate all the SDs that were set to maintenance mode.

Expected results:
All steps should pass successfully without error logs.

Actual results:
As expected.
This bugzilla is included in oVirt 4.5.0 release, published on April 20th 2022. Since the problem described in this bug report should be resolved in oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.