Created attachment 854570 [details] Screenshots

Description of problem:
I could no longer find the "Red Hat Enterprise Virtualization Manager" product, so I opened this bug against RHEL.

Version-Release number of selected component (if applicable):
1 x HP EVA6400
1 x RHEV 3.2.5-0.49.el6ev
2 x RHEV Hypervisor - 6.4 - 20131016.0.el6

How reproducible:
Present a new LUN to both hypervisors. The Manager fails to extend the storage domain because one of them does not recognize the disk.

Steps to Reproduce:
1. Create a new VDISK and present it to the hypervisors
2. Go to the Manager and edit the storage domain
3. Try to add the new LUN (it is listed as long as the SPM can see it)

Actual results:
The Manager reports a failure, with these messages in the log /var/log/ovirt/engine.log:

2014-01-23 15:48:34,972 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-3) START, GetDeviceListVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, storageType=FCP), log id: 15d00fc6
2014-01-23 15:48:34,988 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-10) START, GetDeviceListVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, storageType=FCP), log id: ebb9b5a
2014-01-23 15:48:40,879 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-3) FINISH, GetDeviceListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.LUNs@254bbddc, org.ovirt.engine.core.common.businessentities.LUNs@da8b5a05, org.ovirt.engine.core.common.businessentities.LUNs@af450e5c, org.ovirt.engine.core.common.businessentities.LUNs@6906a9f7, org.ovirt.engine.core.common.businessentities.LUNs@85edd626, org.ovirt.engine.core.common.businessentities.LUNs@d8fa742c], log id: 15d00fc6
2014-01-23 15:48:44,110 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-10) FINISH, GetDeviceListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.LUNs@254bbddc, org.ovirt.engine.core.common.businessentities.LUNs@da8b5a05, org.ovirt.engine.core.common.businessentities.LUNs@af450e5c, org.ovirt.engine.core.common.businessentities.LUNs@6906a9f7, org.ovirt.engine.core.common.businessentities.LUNs@85edd626, org.ovirt.engine.core.common.businessentities.LUNs@d8fa742c], log id: ebb9b5a
2014-01-23 15:49:00,346 INFO [org.ovirt.engine.core.bll.storage.UpdateStorageDomainCommand] (ajp-/127.0.0.1:8702-3) [5f1b5db5] Running command: UpdateStorageDomainCommand internal: false. Entities affected : ID: 5237a82e-d5a0-4d41-9ee0-49398fdc946b Type: Storage
2014-01-23 15:49:00,607 INFO [org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] Running command: ConnectAllHostsToLunCommand internal: true. Entities affected : ID: 5237a82e-d5a0-4d41-9ee0-49398fdc946b Type: Storage
2014-01-23 15:49:00,761 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] START, GetDeviceListVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, storageType=FCP), log id: 5723375b
2014-01-23 15:49:07,265 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDeviceListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.LUNs@254bbddc, org.ovirt.engine.core.common.businessentities.LUNs@da8b5a05, org.ovirt.engine.core.common.businessentities.LUNs@af450e5c, org.ovirt.engine.core.common.businessentities.LUNs@6906a9f7, org.ovirt.engine.core.common.businessentities.LUNs@85edd626, org.ovirt.engine.core.common.businessentities.LUNs@d8fa742c], log id: 5723375b
2014-01-23 15:49:07,383 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] START, GetDevicesVisibilityVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, devicesIds=[36001438005deac11000070000af80000]), log id: 254e7ec7
2014-01-23 15:49:07,488 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDevicesVisibilityVDSCommand, return: {36001438005deac11000070000af80000=true}, log id: 254e7ec7
2014-01-23 15:49:07,529 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] START, GetDevicesVisibilityVDSCommand(HostName = rhev02.dms.local, HostId = 445f76dd-5fbd-4833-81ef-7f7f2f42a65e, devicesIds=[36001438005deac11000070000af80000]), log id: 2a31a68c
2014-01-23 15:49:07,545 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDevicesVisibilityVDSCommand, return: {36001438005deac11000070000af80000=false}, log id: 2a31a68c
2014-01-23 15:49:07,550 ERROR [org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] Transaction rolled-back for command: org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand.
2014-01-23 15:49:07,554 WARN [org.ovirt.engine.core.bll.storage.ExtendSANStorageDomainCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] CanDoAction of action ExtendSANStorageDomain failed. Reasons:VAR__TYPE__STORAGE__DOMAIN,VAR__ACTION__EXTEND,ERROR_CANNOT_EXTEND_CONNECTION_FAILED,$lun

Expected results:
Storage domain extended.

Additional info:
We do not use hypervisor 6.5 because it failed to update when bond+VLAN interfaces were present in the configuration. It becomes unresponsive.
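To read which host reported the LUN as not visible, the GetDevicesVisibilityVDSCommand START and FINISH entries can be paired on their "log id". The following is only a rough sketch of that idea in Python: the function and constant names are made up for illustration, the regular expressions are derived solely from the four log lines quoted in this report, and other engine versions may log a different format.

```python
import re

# Patterns derived from the GetDevicesVisibilityVDSCommand lines in this report
# (assumption: other engine versions may format these lines differently).
START_RE = re.compile(
    r"START, GetDevicesVisibilityVDSCommand\(HostName = (?P<host>[^,]+),"
    r".*?log id: (?P<log_id>[0-9a-f]+)"
)
FINISH_RE = re.compile(
    r"FINISH, GetDevicesVisibilityVDSCommand, return: \{(?P<result>[^}]*)\},"
    r" log id: (?P<log_id>[0-9a-f]+)"
)


def visibility_by_host(lines):
    """Return {host: {device_id: bool}} by pairing START/FINISH on log id."""
    host_by_log_id = {}
    visibility = {}
    for line in lines:
        start = START_RE.search(line)
        if start:
            host_by_log_id[start.group("log_id")] = start.group("host")
            continue
        finish = FINISH_RE.search(line)
        if finish and finish.group("log_id") in host_by_log_id:
            host = host_by_log_id[finish.group("log_id")]
            devices = visibility.setdefault(host, {})
            # The return value looks like "{id1=true, id2=false}".
            for pair in filter(None, finish.group("result").split(", ")):
                device_id, _, flag = pair.partition("=")
                devices[device_id.strip()] = flag.strip() == "true"
    return visibility


# The four visibility lines from the engine.log excerpt in this report.
SAMPLE_LOG = """
2014-01-23 15:49:07,383 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] START, GetDevicesVisibilityVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, devicesIds=[36001438005deac11000070000af80000]), log id: 254e7ec7
2014-01-23 15:49:07,488 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDevicesVisibilityVDSCommand, return: {36001438005deac11000070000af80000=true}, log id: 254e7ec7
2014-01-23 15:49:07,529 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] START, GetDevicesVisibilityVDSCommand(HostName = rhev02.dms.local, HostId = 445f76dd-5fbd-4833-81ef-7f7f2f42a65e, devicesIds=[36001438005deac11000070000af80000]), log id: 2a31a68c
2014-01-23 15:49:07,545 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDevicesVisibilityVDSCommand, return: {36001438005deac11000070000af80000=false}, log id: 2a31a68c
"""

for host, devices in visibility_by_host(SAMPLE_LOG.splitlines()).items():
    print(host, devices)
```

For the excerpt above this shows rhev01.dms.local reporting the device as visible and rhev02.dms.local reporting it as not visible, which matches the rollback that follows in the log.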
(In reply to Renê Rinco from comment #0)
...
> Additional info:
> We do not use the hypervisor 6.5 because it failed to update when bond+vlan
> interfaces was found on configuration. It becomes unresponsive.

Hey, did you also open a bug for this issue?
Alon, can you read something from those errors?
Yes. I tried again with the new rhev-hypervisor6-6.5-20140112.0, but the problem persisted.

https://bugzilla.redhat.com/show_bug.cgi?id=1057301

Now my hypervisor is down and unusable. I will reinstall it from the latest 6.4 image (which works).
Daniel, please take a look?
Please note that failing the extend when a LUN is not visible from all hosts in the DC is the expected behaviour (otherwise that host would become unusable, and VMs running on it might pause because their disks were extended onto the newly added LUN while the host has no access to it). What needs to be determined here is why the second host was not able to 'see' the LUN.
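The rule described here can be illustrated with a small sketch. This is not the engine's implementation (the engine is Java, and ConnectAllHostsToLunCommand is only known here from the log); the function names and the dictionary layout are hypothetical, chosen just to show the "visible from every host, or fail" precondition.

```python
def hosts_missing_lun(visibility, lun_id):
    """Return the hosts (sorted) that do not report lun_id as visible.

    visibility is a hypothetical mapping {host_name: {device_id: bool}};
    a host with no entry for lun_id is treated as not seeing it.
    """
    return sorted(
        host
        for host, devices in visibility.items()
        if not devices.get(lun_id, False)
    )


def can_extend(visibility, lun_id):
    """Allow the extend only if every host in the data center sees the LUN."""
    return not hosts_missing_lun(visibility, lun_id)


# Values taken from the engine.log excerpt in this report.
lun = "36001438005deac11000070000af80000"
reported = {
    "rhev01.dms.local": {lun: True},
    "rhev02.dms.local": {lun: False},
}
print(can_extend(reported, lun), hosts_missing_lun(reported, lun))
```

With the values from this report the check fails because of rhev02.dms.local alone, so a single host with a stale view of the SAN is enough to block the extend.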
Hi Renê,

* Can you please attach full vdsm and engine logs?
* Have you re-installed the hypervisor from the 6.4 image, as mentioned in comment 4?
Created attachment 864178 [details] vdsm.log

Hi Daniel,

I can't attach the vdsm log from the time the problem occurred because I had already re-installed my hypervisor with 6.4 (6.4-20131016.0.el6). I am attaching the post-installation vdsm.log instead. The relevant part of engine.log was included as text in comment 1.
Hi Rene, Can you please reproduce and attach full VDSM + Engine + sanlock logs?
Closing as there is just not enough info here. Please reopen if you can provide logs. Thanks.
Created attachment 884039 [details] engine.log
Created attachment 884040 [details] sanlock.log RHEV01
Created attachment 884041 [details] sanlock.log RHEV02
Created attachment 884043 [details] vdsm.log RHEV01
Created attachment 884045 [details] vdsm.log RHEV02
Hi there! I hit this bug again and have now attached the requested files.
Hi Rene,

A couple of questions to understand where the problem lies:

* According to the logs, the selected LUN [1] is visible to host 'rhev01.dms.local' (according to [2]). Can you please check its visibility manually (using multipath -ll) from the other host ('rhev02.dms.local'), since it failed there according to [3]?
* Was that LUN added after the specified storage domain was created? (If so, this might already have been resolved by bug 1071654.)
* Do you encounter the same behavior on newer builds as well (3.4/3.5)?

[1] 36001438005deac11000070000af80000
[2] /127.0.0.1:8702-9) [7e80b072] START, GetDevicesVisibilityVDSCommand(HostName = rhev01.dms.local, HostId = e8425b25-1019-430a-8a94-9e8fe7dcf711, devicesIds=[36001438005deac11000070000af80000]), log id: 254e7ec7
[3] [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-9) [7e80b072] FINISH, GetDevicesVisibilityVDSCommand, return: {36001438005deac11000070000af80000=false}, log id: 2a31a68c
Hi Daniel,

1) That's true. After I presented the LUN to both hypervisors, only rhev01 saw it; I also confirmed that with multipath -ll.
2) Yes. I was editing the storage domain. Bug 1071654 reflects exactly what I did. After rescanning the bus, everything went back to normal.
3) I don't have a newer build in our environment. We are using build 3.3, where the problem happens too.
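The thread does not say how the bus was rescanned. One common way on Linux is to write the wildcard "- - -" to each /sys/class/scsi_host/hostN/scan file, and the sketch below assumes that method. The function name and the sysfs_root parameter are hypothetical (the parameter exists only so the function can be tried against a scratch directory); it has to run as root on the host that does not see the LUN, and multipath -ll can then be used to confirm the new device appears.

```python
from pathlib import Path


def rescan_scsi_hosts(sysfs_root="/sys/class/scsi_host"):
    """Ask every SCSI host adapter to rescan all channels, targets and LUNs.

    Writes the wildcard "- - -" to each hostN/scan file under sysfs_root
    and returns the list of files written. Requires root on a real system.
    """
    scanned = []
    for scan_file in sorted(Path(sysfs_root).glob("host*/scan")):
        scan_file.write_text("- - -\n")
        scanned.append(str(scan_file))
    return scanned
```

Calling rescan_scsi_hosts() with no arguments targets the real sysfs tree; whether this is sufficient on a given Fibre Channel setup is an assumption, not something confirmed in this report.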
Thanks Rene. Adding bug 1071654 as 'Depends On'. The fix should be available in 3.5 (and next build of 3.4 as part of bug 1123637).
I created a new LUN and was able to extend the storage domain successfully. Moving to Verified.
I have checked the scenario on both SCSI and FC configurations as follows:

1. Created a new LUN via the storage server and mapped it to both hosts
2. Edited the Storage Domain and added the additional LUN

>>>> the Storage Domain was successfully extended.

This bug is now verified on both SCSI and FC configurations.
Nir, since you added the "requires-doctext?" flag, could you add a couple of words for the docs team about this bug? Thanks!
This looks like a duplicate and the issue is explained in the dependent bugs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0159.html