Bug 1057284 - Error while execution action: Cannot extend Storage Domain. Storage device is unreachable from ${hostname}
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.2.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Nir Soffer
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard: storage
Depends On: 1071654
Blocks: rhev3.5beta 1156165
Reported: 2014-01-23 18:38 UTC by Renê Rinco
Modified: 2019-04-28 09:40 UTC

Fixed In Version: v4.16.2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-11 21:10:09 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
Screenshots (412.38 KB, application/x-zip-compressed)
2014-01-23 18:38 UTC, Renê Rinco
vdsm.log (777.61 KB, application/octet-stream)
2014-02-17 16:25 UTC, Renê Rinco
engine.log (2.35 MB, text/plain)
2014-04-08 12:42 UTC, Renê Rinco
sanlock.log RHEV01 (799 bytes, text/plain)
2014-04-08 12:43 UTC, Renê Rinco
sanlock.log RHEV02 (1.97 KB, text/plain)
2014-04-08 12:44 UTC, Renê Rinco
vdsm.log RHEV01 (8.06 MB, text/plain)
2014-04-08 12:49 UTC, Renê Rinco
vdsm.log RHEV02 (7.93 MB, text/plain)
2014-04-08 12:55 UTC, Renê Rinco


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 637763 0 None None None Never
Red Hat Product Errata RHBA-2015:0159 0 normal SHIPPED_LIVE vdsm 3.5.0 - bug fix and enhancement update 2015-02-12 01:35:58 UTC
oVirt gerrit 27122 0 None MERGED multipath: Rescan also FC devices 2021-02-21 18:25:52 UTC
oVirt gerrit 30984 0 None MERGED multipath: Rescan also FC devices 2021-02-21 18:25:51 UTC

Description Renê Rinco 2014-01-23 18:38:49 UTC
Created attachment 854570 [details]
Screenshots

Description of problem:
I could no longer find the "Red Hat Enterprise Virtualization Manager" product, so I opened this bug against RHEL.

Version-Release number of selected component (if applicable):
1 x HP EVA6400
1 x RHEV 3.2.5-0.49.el6ev 
2 x RHEV Hypervisor - 6.4 - 20131016.0.el6

How reproducible:
Present a new LUN to both hypervisors. The Manager fails to extend the storage domain because one of them does not recognize the disk.

Steps to Reproduce:
1. Create a new VDISK and present it to the hypervisors
2. Go to the Manager and edit the storage domain
3. Try to add the new LUN (it is listed as long as the SPM host has found it)

Actual results:
The Manager reports a failure, with these messages in the log /var/log/ovirt/engine.log:
2014-01-23 15:48:34,972 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-3) START, GetDeviceListVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, storageType=FCP), log id: 15d00fc6
2014-01-23 15:48:34,988 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-10) START, GetDeviceListVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, storageType=FCP), log id: ebb9b5a
2014-01-23 15:48:40,879 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-3) FINISH, GetDeviceListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.LUNs@254bbddc, org.ovirt.engine.core.common.businessentities.LUNs@da8b5a05, org.ovirt.engine.core.common.businessentities.LUNs@af450e5c, org.ovirt.engine.core.common.businessentities.LUNs@6906a9f7, org.ovirt.engine.core.common.businessentities.LUNs@85edd626, org.ovirt.engine.core.common.businessentities.LUNs@d8fa742c], log id: 15d00fc6
2014-01-23 15:48:44,110 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-10) FINISH, GetDeviceListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.LUNs@254bbddc, org.ovirt.engine.core.common.businessentities.LUNs@da8b5a05, org.ovirt.engine.core.common.businessentities.LUNs@af450e5c, org.ovirt.engine.core.common.businessentities.LUNs@6906a9f7, org.ovirt.engine.core.common.businessentities.LUNs@85edd626, org.ovirt.engine.core.common.businessentities.LUNs@d8fa742c], log id: ebb9b5a
2014-01-23 15:49:00,346 INFO  [org.ovirt.engine.core.bll.storage.UpdateStorageDomainCommand] (ajp-/127.0.0.1:8702-3) [5f1b5db5] Running command: UpdateStorageDomainCommand internal: false. Entities affected :  ID: 5237a82e-d5a0-4d41-9ee0-49398fdc946b Type: Storage
2014-01-23 15:49:00,607 INFO  [org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] Running command: ConnectAllHostsToLunCommand internal: true. Entities affected :  ID: 5237a82e-d5a0-4d41-9ee0-49398fdc946b Type: Storage
2014-01-23 15:49:00,761 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] START, GetDeviceListVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, storageType=FCP), log id: 5723375b
2014-01-23 15:49:07,265 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDeviceListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.LUNs@254bbddc, org.ovirt.engine.core.common.businessentities.LUNs@da8b5a05, org.ovirt.engine.core.common.businessentities.LUNs@af450e5c, org.ovirt.engine.core.common.businessentities.LUNs@6906a9f7, org.ovirt.engine.core.common.businessentities.LUNs@85edd626, org.ovirt.engine.core.common.businessentities.LUNs@d8fa742c], log id: 5723375b
2014-01-23 15:49:07,383 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] START, GetDevicesVisibilityVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, devicesIds=[36001438005deac11000070000af80000]), log id: 254e7ec7
2014-01-23 15:49:07,488 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDevicesVisibilityVDSCommand, return: {36001438005deac11000070000af80000=true}, log id: 254e7ec7
2014-01-23 15:49:07,529 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] START, GetDevicesVisibilityVDSCommand(HostName = rhev02.dms.local, HostId = 445f76dd-5fbd-4833-81ef-7f7f2f42a65e, devicesIds=[36001438005deac11000070000af80000]), log id: 2a31a68c
2014-01-23 15:49:07,545 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDevicesVisibilityVDSCommand, return: {36001438005deac11000070000af80000=false}, log id: 2a31a68c
2014-01-23 15:49:07,550 ERROR [org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] Transaction rolled-back for command: org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand.
2014-01-23 15:49:07,554 WARN  [org.ovirt.engine.core.bll.storage.ExtendSANStorageDomainCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] CanDoAction of action ExtendSANStorageDomain failed. Reasons:VAR__TYPE__STORAGE__DOMAIN,VAR__ACTION__EXTEND,ERROR_CANNOT_EXTEND_CONNECTION_FAILED,$lun
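
The per-host result can be read out of the lines above by pairing each GetDevicesVisibilityVDSCommand START line with the FINISH line that carries the same log id. Below is a minimal Python sketch of that pairing, assuming the line layout shown in this excerpt (other engine versions may log differently). The function name visibility_by_host is hypothetical and is not part of the engine.

```python
import re

# Patterns follow the engine.log lines quoted in this report (an assumption
# about the format; it is not a documented, stable interface).
START = re.compile(
    r"START, GetDevicesVisibilityVDSCommand\(HostName = (?P<host>[^,]+),"
    r".*log id: (?P<id>\w+)"
)
FINISH = re.compile(
    r"FINISH, GetDevicesVisibilityVDSCommand, return: \{(?P<result>[^}]*)\},"
    r" log id: (?P<id>\w+)"
)


def visibility_by_host(lines):
    """Map host name -> {device id: visible} by pairing START/FINISH on log id."""
    host_for_id = {}
    visibility = {}
    for line in lines:
        m = START.search(line)
        if m:
            host_for_id[m.group("id")] = m.group("host").strip()
            continue
        m = FINISH.search(line)
        if m and m.group("id") in host_for_id:
            devices = {}
            for pair in m.group("result").split(","):
                if "=" in pair:
                    dev, _, flag = pair.strip().partition("=")
                    devices[dev] = flag.strip() == "true"
            visibility[host_for_id[m.group("id")]] = devices
    return visibility
```

Fed the four GetDevicesVisibilityVDSCommand lines above, this maps rhev01.dms.local to visible and rhev02.dms.local to not visible for device 36001438005deac11000070000af80000.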

Expected results:
Storage domain extended.

Additional info:
We do not use hypervisor 6.5 because it failed to update when bond+VLAN interfaces were present in the configuration. The host became unresponsive.

Comment 2 Fabian Deutsch 2014-01-23 19:11:05 UTC
(In reply to Renê Rinco from comment #0)
...
> Additional info:
> We do not use the hypervisor 6.5 because it failed to update when bond+vlan
> interfaces was found on configuration. It becomes unresponsive.

Hey,

did you also open a bug for this issue?

Comment 3 Fabian Deutsch 2014-01-23 19:12:13 UTC
Alon,

can you read something from those errors?

Comment 4 Renê Rinco 2014-01-23 19:39:22 UTC
Yes. I tried again with the new rhev-hypervisor6-6.5-20140112.0, but the problem persisted.
https://bugzilla.redhat.com/show_bug.cgi?id=1057301 

Now my hypervisor is down and unusable. I will reinstall it from the latest 6.4 image (which works).

Comment 5 Allon Mureinik 2014-01-27 15:38:25 UTC
Daniel, please take a look?

Comment 6 Ayal Baron 2014-02-16 09:19:41 UTC
Please note that failing the extend when a LUN is not visible from all hosts in the DC is the expected behaviour. Otherwise that host would become unusable, and VMs running on it might pause because their disks were extended onto the newly added LUN while the host has no access to it.
What needs to be determined here is why the second host was not able to 'see' the LUN.
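
The rule described in this comment (refuse the extend unless every host in the DC can see every new LUN) can be sketched as follows. This is an illustrative Python sketch only, not the engine's actual code; the names hosts_missing_luns and can_extend and the shape of the input dictionary are hypothetical.

```python
def hosts_missing_luns(visibility, lun_ids):
    """Return {host: [LUN ids it cannot see]} for hosts lacking any requested LUN.

    visibility: {host name: {LUN id: bool}}, one entry per host in the DC.
    A LUN that is absent from a host's report is treated as not visible.
    """
    missing = {}
    for host, seen in visibility.items():
        unseen = [lun for lun in lun_ids if not seen.get(lun, False)]
        if unseen:
            missing[host] = unseen
    return missing


def can_extend(visibility, lun_ids):
    """The extend is allowed only when no host is missing any of the LUNs."""
    return not hosts_missing_luns(visibility, lun_ids)
```

With the values from the engine.log excerpt in the description, rhev02.dms.local is reported as missing the LUN, so can_extend returns False, which matches the observed failure.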

Comment 7 Daniel Erez 2014-02-16 18:14:35 UTC
Hi Renê,

* Can you please attach full vdsm and engine logs?
* Have you re-installed the hypervisor from 6.4 img as mentioned in comment 4?

Comment 8 Renê Rinco 2014-02-17 16:25:48 UTC
Created attachment 864178 [details]
vdsm.log

Hi Daniel,

I can't attach the vdsm log from the moment the problem occurred because I had already reinstalled my hypervisor with 6.4 (6.4-20131016.0.el6). I am attaching the post-installation vdsm.log. The relevant part of engine.log was attached as text in comment 1.

Comment 11 Tal Nisan 2014-02-26 09:55:28 UTC
Hi Rene,
Can you please reproduce and attach full VDSM + Engine + sanlock logs?

Comment 12 Ayal Baron 2014-02-26 16:11:32 UTC
Closing as there is just not enough info here.

Please reopen if you can provide logs.

Thanks.

Comment 13 Renê Rinco 2014-04-08 12:42:40 UTC
Created attachment 884039 [details]
engine.log

Comment 14 Renê Rinco 2014-04-08 12:43:52 UTC
Created attachment 884040 [details]
sanlock.log RHEV01

Comment 15 Renê Rinco 2014-04-08 12:44:27 UTC
Created attachment 884041 [details]
sanlock.log RHEV02

Comment 16 Renê Rinco 2014-04-08 12:49:59 UTC
Created attachment 884043 [details]
vdsm.log RHEV01

Comment 17 Renê Rinco 2014-04-08 12:55:11 UTC
Created attachment 884045 [details]
vdsm.log RHEV02

Comment 18 Renê Rinco 2014-04-08 13:00:59 UTC
Hi there! I hit this bug again and have now attached the requested files.

Comment 19 Daniel Erez 2014-07-29 12:27:33 UTC
Hi Rene,

A couple of questions to understand where the problem lies:
* According to the logs, the selected LUN [1] is visible to host 'rhev01.dms.local' (according to [2]). Can you please check manually (using multipath -ll) its visibility from the other host ('rhev02.dms.local'), as it failed according to [3]?
* Was that LUN added after the specified storage domain was created?
(i.e. it might already have been resolved by bug 1071654)
* Do you encounter the same behavior on newer builds as well (3.4/3.5)?

[1] 36001438005deac11000070000af80000

[2]
/127.0.0.1:8702-9) [7e80b072] START, GetDevicesVisibilityVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, devicesIds=[36001438005deac11000070000af80000]), log id: 254e7ec7

[3]
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDevicesVisibilityVDSCommand, return: {36001438005deac11000070000af80000=false}, log id: 2a31a68c
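
The manual check requested above (looking for the LUN in the output of multipath -ll on each host) can be scripted. The Python sketch below assumes the usual map header layout of multipath -ll, either "<wwid> dm-N VENDOR,PRODUCT" or "<alias> (<wwid>) dm-N VENDOR,PRODUCT" when user-friendly names are enabled, and a Python 3.7+ interpreter. The function names are hypothetical, and the command must be run as root on the host being checked.

```python
import re
import subprocess

# Header line of a multipath map: "<wwid> dm-N ..." or "<alias> (<wwid>) dm-N ...".
# This layout is an assumption about typical `multipath -ll` output.
MAP_HEADER = re.compile(r"^(?P<name>\S+)(?: \((?P<wwid>\S+)\))?\s+dm-\d+")


def wwids_in_multipath_output(text):
    """Collect the WWIDs of all multipath maps listed in `multipath -ll` output."""
    wwids = set()
    for line in text.splitlines():
        m = MAP_HEADER.match(line)
        if m:
            # With an alias the WWID is in parentheses; otherwise it is the name.
            wwids.add(m.group("wwid") or m.group("name"))
    return wwids


def lun_is_visible(wwid):
    """Run `multipath -ll` on this host and report whether the WWID is listed."""
    out = subprocess.run(
        ["multipath", "-ll"], capture_output=True, text=True, check=True
    ).stdout
    return wwid in wwids_in_multipath_output(out)
```

Running lun_is_visible("36001438005deac11000070000af80000") on each host would show which host is missing the map.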

Comment 20 Renê Rinco 2014-07-29 17:17:16 UTC
Hi Daniel,

1) That's right. After I presented the LUN to both hypervisors, only rhev01 saw it. I also confirmed that with multipath -ll.
2) Yes. I was editing the storage domain. Bug 1071654 reflects exactly what I did. After rescanning the bus, everything went back to normal.
3) I don't have a newer build in our environment. We are using build 3.3, where the problem happens too.
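
For reference, the manual bus rescan mentioned in answer 2 is commonly done through sysfs: writing "- - -" (wildcard channel, target and LUN) to /sys/class/scsi_host/hostN/scan asks the kernel to look for new devices on that host. The Python sketch below does this for every SCSI host that is also listed under /sys/class/fc_host. It illustrates the kernel interface only and is not the code from the gerrit changes linked to this bug; the function name rescan_fc_hosts and the sysfs_root parameter are hypothetical (the parameter exists so the function can be exercised against a fake directory tree). It must run as root.

```python
import glob
import os


def rescan_fc_hosts(sysfs_root="/sys"):
    """Ask the kernel to rescan every SCSI host backed by an FC HBA.

    Returns the list of scan files that were written.
    """
    scanned = []
    pattern = os.path.join(sysfs_root, "class/fc_host/host*")
    for fc_host in sorted(glob.glob(pattern)):
        name = os.path.basename(fc_host)  # e.g. "host1"
        scan_file = os.path.join(sysfs_root, "class/scsi_host", name, "scan")
        if os.path.exists(scan_file):
            with open(scan_file, "w") as f:
                f.write("- - -")  # wildcard channel, target, LUN
            scanned.append(scan_file)
    return scanned
```

After the rescan, multipath -ll on the affected host should list the new LUN if the array presents it correctly.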

Comment 21 Daniel Erez 2014-07-29 17:32:38 UTC
Thanks Rene.
Adding bug 1071654 as 'Depends On'. The fix should be available in 3.5 (and next build of 3.4 as part of bug 1123637).

Comment 28 Kevin Alon Goldblatt 2014-09-16 12:31:10 UTC
I created a new LUN and was able to extend the storage domain successfully. Moving to Verified.

Comment 29 Kevin Alon Goldblatt 2014-09-16 13:28:38 UTC
I have checked the scenario on both SCSI and FC configurations as follows:

1. Created a new LUN via the storage server and mapped it to both hosts
2. Edited the Storage Domain and added the additional LUN >>>> the Storage Domain was successfully extended.

This bug is now verified on both SCSI and FC configurations

Comment 30 Allon Mureinik 2014-11-07 09:03:05 UTC
Nir, since you added the "requires-doctext?" flag, could you add a couple of words for the docs team about this bug?
Thanks!

Comment 31 Nir Soffer 2014-12-15 12:59:21 UTC
This looks like a duplicate and the issue is explained in the dependent bugs.

Comment 33 errata-xmlrpc 2015-02-11 21:10:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0159.html

