Created attachment 892281
logs from engine and host

Description of problem:
Configured an iSCSI multipath bond. I tried to replace the networks participating in the bond and failed because vdsm was unable to connect to the storage server via the replacement network. The engine did not handle this failure correctly. I got this in webadmin:

Operation Canceled
Error while executing action EditIscsiBond: Internal Engine Error

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
On a shared DC with active iSCSI storage domain(s):
1. Create 3 new networks and attach them to the cluster with the 'required' check-box checked
2. Attach the networks to the NICs of the cluster's hosts
3. Create a new iSCSI multipath bond (under DC tab -> pick the relevant DC -> iSCSI multipath sub-tab -> new) and add 2 of the new networks to it, along with the targets
4. Move the iSCSI domain to maintenance and activate it, so that the connection to the storage is made from the new networks
5. After the iSCSI domain is active, edit the multipath bond, uncheck the checked networks and pick the third network. Click 'Ok'

VDSM will fail to perform the operation with IscsiNodeError

Actual results:
The engine doesn't know how to handle the error from vdsm and throws the following error in the log, which shows up as an internal engine error in webadmin:

2014-05-04 15:39:56,678 ERROR [org.ovirt.engine.core.bll.storage.EditIscsiBondCommand] (ajp-/127.0.0.1:8702-10) [37fea003] Command org.ovirt.engine.core.bll.storage.EditIscsiBondCommand throw exception: java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.util.concurrent.TimeoutException (Failed with error VDS_NETWORK_ERROR and code 5022)
        at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil.invokeAll(ThreadPoolUtil.java:205) [utils.jar:]
        at org.ovirt.engine.core.bll.storage.BaseIscsiBondCommand.connectAllHostsToStorage(BaseIscsiBondCommand.java:56) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.EditIscsiBondCommand.executeCommand(EditIscsiBondCommand.java:70) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1123) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1208) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1884) [bll.jar:]

Expected results:
The engine should handle such an error from vdsm, notify the user and revert the operation

Additional info:
logs from engine and host

The error in vdsm:

Thread-1846::ERROR::2014-05-04 15:40:44,971::hsm::2379::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2376, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 359, in connect
    iscsi.addIscsiNode(self._iface, self._target, self._cred)
  File "/usr/share/vdsm/storage/iscsi.py", line 166, in addIscsiNode
    iscsiadm.node_login(iface.name, portalStr, targetName)
  File "/usr/share/vdsm/storage/iscsiadm.py", line 295, in node_login
    raise IscsiNodeError(rc, out, err)
IscsiNodeError: (8, ['Logging in to [iface: eth0.1, target: iqn.2008-05.com.xtremio:001e675b8ee1, portal: 10.35.160.3,3260] (multiple)'], ['iscsiadm: Could not login to [iface: eth0.1, target: iqn.2008-05.com.xtremio:001e675b8ee1, portal: 10.35.160.3,3260].', 'iscsiadm: initiator reported error (8 - connection timed out)', 'iscsiadm: Could not log into all portals'])
Version-Release number of selected component (if applicable): AV7 vdsm-4.14.7-0.1.beta3.el6ev.x86_64 rhevm-3.4.0-0.15.beta3.el6ev.noarch
The fix should now add the following log message, followed by the error received from VDSM: "Could not connect Host {hostName} - {hostId} to Iscsi Storage Server." The operation of editing the network should eventually succeed even if an exception from VDSM occurs in the middle.
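The behaviour described above (log each failed host, keep going) can be sketched roughly as follows. This is an illustrative Python sketch only, not the engine's actual Java code: `connect_all_hosts`, the `(host_id, host_name)` pair shape and the `connect` callable are hypothetical stand-ins for the per-host connect call made to VDSM.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("iscsi_bond_sketch")


def connect_all_hosts(hosts, connect, timeout=None):
    """Try to connect every host; log failures and return them instead of raising.

    hosts   -- iterable of (host_id, host_name) pairs (hypothetical shape)
    connect -- callable taking a host_id and raising on failure
               (stand-in for the per-host storage connect call)
    timeout -- optional per-host wait in seconds
    """
    hosts = list(hosts)
    failed = []
    if not hosts:
        return failed

    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        # Submit one connect task per host, remembering which host it belongs to.
        futures = [(host, pool.submit(connect, host[0])) for host in hosts]
        for (host_id, host_name), future in futures:
            try:
                future.result(timeout=timeout)
            except Exception as exc:
                # A failure on one host is logged and recorded, not propagated,
                # so the remaining hosts are still processed.
                log.error(
                    "Could not connect Host %s - %s to Iscsi Storage Server. %s",
                    host_name, host_id, exc,
                )
                failed.append(host_id)
    return failed
```

A caller could then complete the bond update and report the returned host IDs to the user rather than aborting with an internal error.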
This bug was fixed, but I'm unable to reproduce the IscsiNodeError in vdsm during network replacement inside the iSCSI bond, which is what triggered the internal engine error. I'm closing this bug as UPSTREAM.