Bug 720981 - fail to create a SAN data domain(FC/iSCSI) in rhevm 3.0
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: Saggi Mizrahi
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-07-13 12:42 UTC by cshao
Modified: 2016-04-18 06:41 UTC (History)
16 users

Fixed In Version: vdsm-4.9-86
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 07:31:26 UTC
Target Upstream Version:


Attachments
iscsi storage (19.41 KB, image/jpeg), 2011-07-13 12:42 UTC, cshao
vdsm.log (2.00 MB, text/plain), 2011-07-13 12:44 UTC, cshao
rhevm.log (702.20 KB, text/plain), 2011-07-13 13:16 UTC, cshao
vdsm.log on rhel6 (1.05 MB, text/plain), 2011-07-18 11:39 UTC, Guohua Ouyang
vdsm.log-0725 (4.76 KB, text/plain), 2011-07-25 13:04 UTC, Guohua Ouyang
dm-error shown in rhev-h console after showing up in rhevm (118.54 KB, image/png), 2011-07-25 13:12 UTC, Guohua Ouyang
rhevm.log-0727 (3.81 KB, text/plain), 2011-07-27 08:06 UTC, Guohua Ouyang
vdsm.log-0727 (20.23 KB, text/plain), 2011-07-27 08:06 UTC, Guohua Ouyang
add storage fail screen (130.52 KB, image/png), 2011-07-27 08:07 UTC, Guohua Ouyang
rhevm.log-0728 (6.01 KB, text/x-log), 2011-07-28 10:18 UTC, Guohua Ouyang
vdsm.log-0728 (9.11 KB, text/plain), 2011-07-28 10:18 UTC, Guohua Ouyang
screenshot on fc (129.53 KB, image/png), 2011-07-28 10:21 UTC, Guohua Ouyang


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2011:1782 0 normal SHIPPED_LIVE new packages: vdsm 2011-12-06 11:55:51 UTC

Description cshao 2011-07-13 12:42:37 UTC
Created attachment 512654 [details]
iscsi storage

Description of problem:
Can't connect to iSCSI storage in rhevm 3.0; RHEV-M displays the search page indefinitely.

Version-Release number of selected component (if applicable):
rhevm-3.0.0_0001-9.x86_64
rhev-hypervisor-6.2-0.5.el6


How reproducible:
100%

Steps to Reproduce:
1. Register RHEV-H to RHEV-M.
2. RHEV-H: set the iSCSI initiator name in /etc/iscsi/initiatorname.iscsi.
3. RHEV-M: create a new domain, choosing iSCSI as the storage type.
4. See the attached log file.
  
Actual results:
RHEV-M can't connect to the iSCSI storage; it displays the search page indefinitely.


Expected results:
RHEV-M can connect to the iSCSI storage successfully.

Additional info:
Below is my test machine info:
                                    
[root@localhost ~]# pvs
  Found duplicate PV e7kARgwYKz6NEP50vxdBIZG2nMpJvg82: using /dev/mapper/1ATA_Hitachi_HDT721032SLA380_STA2L7MT1ZZRKBp3 not /dev/mapper/1ATA     Hitachi HDT721032SLA380                       STA2L7MT1ZZRKBp3
  PV                                                        VG     Fmt  Attr PSize   PFree  
  /dev/mapper/1ATA_Hitachi_HDT721032SLA380_STA2L7MT1ZZRKBp3 HostVG lvm2 a-   297.61g 295.07g
[root@localhost ~]# vgs
  Found duplicate PV e7kARgwYKz6NEP50vxdBIZG2nMpJvg82: using /dev/mapper/1ATA_Hitachi_HDT721032SLA380_STA2L7MT1ZZRKBp3 not /dev/mapper/1ATA     Hitachi HDT721032SLA380                       STA2L7MT1ZZRKBp3
  VG     #PV #LV #SN Attr   VSize   VFree  
  HostVG   1   4   0 wz--n- 297.61g 295.07g
[root@localhost ~]# 
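The duplicate-PV warning above shows the same disk under two spellings of its multipath alias: one with literal runs of spaces, and the /dev/mapper form with those runs collapsed to underscores. As a minimal sketch of that relationship (the rule is inferred from this output, not taken from the authoritative device-mapper name-mangling documentation):

```python
import re

def mangle_dm_name(name):
    """Collapse runs of whitespace to single underscores.

    This mirrors how the space-laden alias in the pvs warning maps to
    the underscore form LVM actually chose (an inference from this
    output, not the official device-mapper mangling rule).
    """
    return re.sub(r"\s+", "_", name.strip())
```

Applied to the alias with embedded spaces, this yields the `/dev/mapper/1ATA_Hitachi_..._STA2L7MT1ZZRKBp3` name that pvs reports using.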


#cat /var/log/vdsm/vdsm.log
..............

Thread-133::DEBUG::2011-07-13 12:01:16,952::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-133::DEBUG::2011-07-13 12:01:16,952::task::492::TaskManager.Task::(_debug) Task 8da5eb80-1824-4a5f-b5f5-57cd94b32d5d: moving from state preparing -> state aborting
Thread-133::DEBUG::2011-07-13 12:01:16,953::task::492::TaskManager.Task::(_debug) Task 8da5eb80-1824-4a5f-b5f5-57cd94b32d5d: _aborting: recover policy none
Thread-133::DEBUG::2011-07-13 12:01:16,954::task::492::TaskManager.Task::(_debug) Task 8da5eb80-1824-4a5f-b5f5-57cd94b32d5d: moving from state aborting -> state failed
Thread-133::DEBUG::2011-07-13 12:01:16,954::resourceManager::786::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-133::DEBUG::2011-07-13 12:01:16,955::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-133::ERROR::2011-07-13 12:01:16,956::dispatcher::106::Storage.Dispatcher.Protect::(run) [Errno 2] No such file or directory: '/sys/block/1ATA     Hitachi HDT721032SLA380                       STA2L7MT1ZZRKB/queue/logical_block_size'
Thread-133::ERROR::2011-07-13 12:01:16,956::dispatcher::107::Storage.Dispatcher.Protect::(run) Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 96, in run
  File "/usr/share/vdsm/storage/task.py", line 1184, in prepare
IOError: [Errno 2] No such file or directory: '/sys/block/1ATA     Hitachi HDT721032SLA380                       STA2L7MT1ZZRKB/queue/logical_block_size'

Thread-134::INFO::2011-07-13 12:01:19,062::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: repoStats, args: ()
Thread-134::DEBUG::2011-07-13 12:01:19,062::task::492::TaskManager.Task::(_debug) Task 817e43a7-717a-4731-9122-c18c8f164cb8: moving from state init -> state preparing
Thread-134::DEBUG::2011-07-13 12:01:19,063::task::492::TaskManager.Task::(_debug) Task 817e43a7-717a-4731-9122-c18c8f164cb8: finished: {}
..................
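The traceback above shows vdsm building a /sys/block path from a device alias that still contains raw spaces; sysfs block-device entry names (sda, dm-6, ...) never contain spaces, so the open() fails with ENOENT. A minimal sketch of the lookup and a sanity check (hypothetical helper names, not vdsm's actual code):

```python
import os

def queue_attr_path(devname, attr="logical_block_size"):
    # Build the sysfs path that the failing lookup effectively opens.
    return os.path.join("/sys/block", devname, "queue", attr)

def is_valid_sysfs_name(devname):
    # sysfs entries never contain spaces or slashes, so an unmangled
    # multipath alias with embedded spaces can never resolve here.
    return devname != "" and " " not in devname and "/" not in devname
```

With the alias from the log, `is_valid_sysfs_name` is false, which is exactly why the IOError above is raised.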

Comment 2 cshao 2011-07-13 12:44:21 UTC
Created attachment 512655 [details]
vdsm.log

Comment 6 cshao 2011-07-13 13:16:40 UTC
Created attachment 512660 [details]
rhevm.log

Comment 8 Saggi Mizrahi 2011-07-14 07:56:01 UTC
You have to reboot after changing the initiator name; if it is RHEV-H, don't forget to persist the file.

Comment 9 cshao 2011-07-14 09:22:50 UTC
(In reply to comment #8)
> You have to reboot after changing the initiator name; if it is RHEV-H, don't
> forget to persist the file.

After rebooting the host, the initiator name still exists.

Comment 12 Saggi Mizrahi 2011-07-18 12:09:54 UTC
http://gerrit.usersys.redhat.com/717

Comment 17 Guohua Ouyang 2011-07-25 11:59:47 UTC
Tested on rhev-hypervisor-6.2-0.5.el6: adding iSCSI storage failed, reporting "Cannot create Volume Group, error code 502".

rhevm.log:

2011-07-26 03:04:42,675 INFO  [org.nogah.bll.storage.AddSANStorageDomainCommand] (http-0.0.0.0-8443-3) Running command: AddSANStorageDomainCommand internal: false. Entities affected :  ID: aaa00000-0000-0000-0000-123456789aaa Type: System
2011-07-26 03:04:42,725 INFO  [org.nogah.vdsbroker.vdsbroker.CreateVGVDSCommand] (http-0.0.0.0-8443-3) START, CreateVGVDSCommand(vdsId = bf15ed7a-b6ed-11e0-9bb9-3b622d592ae8, storageDomainId=26176e77-b217-4bab-ae39-d4ada6872f3e, deviceList=[360a9800050334c33424a627938317274]), log id: 7ede5787
2011-07-26 03:04:42,741 ERROR [org.nogah.vdsbroker.vdsbroker.BrokerCommandBase] (http-0.0.0.0-8443-3) Failed in CreateVGVDS method
2011-07-26 03:04:42,742 ERROR [org.nogah.vdsbroker.vdsbroker.BrokerCommandBase] (http-0.0.0.0-8443-3) Error code VolumeGroupCreateError and error message VDSGenericException: VDSErrorException: Failed to CreateVGVDS, error = Cannot create Volume Group: "vgname=26176e77-b217-4bab-ae39-d4ada6872f3e, devname=['360a9800050334c33424a627938317274']"
2011-07-26 03:04:42,743 INFO  [org.nogah.vdsbroker.vdsbroker.BrokerCommandBase] (http-0.0.0.0-8443-3) Command org.nogah.vdsbroker.vdsbroker.CreateVGVDSCommand return value
 Class Name: org.nogah.vdsbroker.irsbroker.OneUuidReturnForXmlRpc
mUuid                         Null
mStatus                       Class Name: org.nogah.vdsbroker.vdsbroker.StatusForXmlRpc
mCode                         502
mMessage                      Cannot create Volume Group: "vgname=26176e77-b217-4bab-ae39-d4ada6872f3e, devname=['360a9800050334c33424a627938317274']"


2011-07-26 03:04:42,743 INFO  [org.nogah.vdsbroker.vdsbroker.BrokerCommandBase] (http-0.0.0.0-8443-3) Vds: amd-1216-8-5.englab.nay.redhat.com
2011-07-26 03:04:42,743 ERROR [org.nogah.vdsbroker.VDSCommandBase] (http-0.0.0.0-8443-3) Command CreateVGVDS execution failed. Exception: VDSGenericException: VDSErrorException: Failed to CreateVGVDS, error = Cannot create Volume Group: "vgname=26176e77-b217-4bab-ae39-d4ada6872f3e, devname=['360a9800050334c33424a627938317274']"
2011-07-26 03:04:42,743 INFO  [org.nogah.vdsbroker.vdsbroker.CreateVGVDSCommand] (http-0.0.0.0-8443-3) FINISH, CreateVGVDSCommand, log id: 7ede5787
2011-07-26 03:04:42,743 ERROR [org.nogah.bll.storage.AddSANStorageDomainCommand] (http-0.0.0.0-8443-3) Command org.nogah.bll.storage.AddSANStorageDomainCommand throw Vdc Bll exception. With error message VdcBLLException: org.nogah.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateVGVDS, error = Cannot create Volume Group: "vgname=26176e77-b217-4bab-ae39-d4ada6872f3e, devname=['360a9800050334c33424a627938317274']"
2011-07-26 03:04:42,756 INFO  [org.nogah.bll.storage.AddSANStorageDomainCommand] (http-0.0.0.0-8443-3) Command [id=5fd0937a-c15d-40ce-9d3a-9b8aa036cef6]: Compensating NEW_ENTITY_ID of org.nogah.common.businessentities.storage_domain_dynamic; snapshot: 26176e77-b217-4bab-ae39-d4ada6872f3e.
2011-07-26 03:04:42,765 INFO  [org.nogah.bll.storage.AddSANStorageDomainCommand] (http-0.0.0.0-8443-3) Command [id=5fd0937a-c15d-40ce-9d3a-9b8aa036cef6]: Compensating NEW_ENTITY_ID of org.nogah.common.businessentities.storage_domain_static; snapshot: 26176e77-b217-4bab-ae39-d4ada6872f3e.
2011-07-26 03:04:42,815 ERROR [org.nogah.bll.storage.AddSANStorageDomainCommand] (http-0.0.0.0-8443-3) Transaction rolled-back for command: org.nogah.bll.storage.AddSANStorageDomainCommand.
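The CreateVG failure above was handed device GUID 360a9800050334c33424a627938317274, which (per the later comments) never materialized on the host. A hypothetical pre-flight check sketches the condition; the function name and behavior are illustrative, not vdsm's actual code:

```python
import os

def missing_mapper_devices(guids, mapper_dir="/dev/mapper"):
    """Return GUIDs from the engine's device list that have no
    corresponding device-mapper node on this host.

    Illustrative only: if this list is non-empty, the subsequent
    vgcreate is bound to fail, much like error code 502 above.
    """
    return [g for g in guids
            if not os.path.exists(os.path.join(mapper_dir, g))]
```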

Comment 19 Guohua Ouyang 2011-07-25 13:04:58 UTC
Created attachment 515040 [details]
vdsm.log-0725

Comment 20 Guohua Ouyang 2011-07-25 13:12:02 UTC
Created attachment 515041 [details]
dm-error show in rhev-h console after show up in rhevm.

Another issue: we have multipath bug 725335 on this build (rhev-hypervisor-6.2-0.5.el6). After RHEV-H shows up in RHEV-M, a device-mapper error appears on the RHEV-H console; see the attachment.

Note: this error occurs only after the host shows up in RHEV-M, not while adding the SAN storage; it is mentioned here for reference, as it may be related.

Comment 21 Omer Frenkel 2011-07-26 06:44:18 UTC
The errors mentioned in this bug (600, 502) are known to RHEV-M. It also looks like the error during GetDeviceListQuery is passed to the frontend.

Comment 22 Saggi Mizrahi 2011-07-26 07:38:56 UTC
I would like to see `multipath -ll` output from the host

Comment 23 Guohua Ouyang 2011-07-26 09:04:25 UTC
Results on RHEL 6:
Same as comment #10; the iSCSI storage is added successfully and shows green in RHEV-M IC133.

Comment 24 Guohua Ouyang 2011-07-26 09:22:13 UTC
(In reply to comment #22)
> I would like to see `multipath -ll` output from the host

[root@amd-1352-8-5 ~]# multipath -ll
1IET_00010001 dm-6 IET,VIRTUAL-DISK
size=20G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 7:0:0:1 sdc 8:32 active ready running
1ATA_WDC_WD2502ABYS-18B7A0_WD-WCAT19558392 dm-1 ATA,WDC WD2502ABYS-1
size=233G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 0:0:0:0 sda 8:0  active ready running

Comment 26 Saggi Mizrahi 2011-07-26 10:26:03 UTC
Well, if you look at multipath's output you can clearly see that `360a9800050334c33424a627938317274` doesn't show up. It might be a LUN masking issue. Check your storage target's configuration. If you changed the initiator name, restart iscsid.

Comment 27 Guohua Ouyang 2011-07-26 10:45:49 UTC
(In reply to comment #26)
> Well if you look at multipath's output you can clearly see that
> `360a9800050334c33424a627938317274` doesn't show up. Might be a lun masking
> issue. Check you storage targets configuration. If you changed the initiator
> name restart iscsid.

I restarted iscsid after changing the initiator name.

There is multipath bug 725335 on this build (comment #20); do you think it's the root cause? If so, I'd like to re-test after 725335 is fixed.

Comment 28 Alan Pevec 2011-07-26 11:15:21 UTC
(In reply to comment #27)
> (In reply to comment #26)
> there is multipath bug 725335 on this build (comment #20), do you think it's
> the root cause? If so, I'd like to test this after 725335 is fixed.

Where was RHEV-H installed? In bug 725335, the disk where HostVG is created gets opened before multipath starts, hence it does not show up under multipath.

Comment 29 Saggi Mizrahi 2011-07-26 13:13:26 UTC
I didn't see the device or its paths on the host. I think it's an FC configuration issue. In any case, until the device appears under `multipath -ll`, it's out of VDSM's hands.

Comment 30 Guohua Ouyang 2011-07-27 02:47:00 UTC
(In reply to comment #28)
> (In reply to comment #27)
> > (In reply to comment #26)
> > there is multipath bug 725335 on this build (comment #20), do you think it's
> > the root cause? If so, I'd like to test this after 725335 is fixed.
> 
> Where was rhevh installed on? In bug 725335 disk where HostVG is created gets
> opened before multipath hence not showing up under multipath.

The RHEV-H was installed on an iSCSI LUN. And yes, this LUN isn't showing up under multipath. I just suspect it may be related to 725335.

Comment 33 Ayal Baron 2011-07-27 07:05:39 UTC
(In reply to comment #30)
> (In reply to comment #28)
> > (In reply to comment #27)
> > > (In reply to comment #26)
> > > there is multipath bug 725335 on this build (comment #20), do you think it's
> > > the root cause? If so, I'd like to test this after 725335 is fixed.
> > 
> > Where was rhevh installed on? In bug 725335 disk where HostVG is created gets
> > opened before multipath hence not showing up under multipath.
> 
> The rhevh was installed on iSCSI lun. And yes, this lun isn't showing up under
> multipath. I just double it may related to 725335.

So that would be a configuration problem.  Moving back to ON_QA.

Comment 34 Guohua Ouyang 2011-07-27 08:05:39 UTC
(In reply to comment #33)
> (In reply to comment #30)
> > (In reply to comment #28)
> > > (In reply to comment #27)
> > > > (In reply to comment #26)
> > > > there is multipath bug 725335 on this build (comment #20), do you think it's
> > > > the root cause? If so, I'd like to test this after 725335 is fixed.
> > > 
> > > Where was rhevh installed on? In bug 725335 disk where HostVG is created gets
> > > opened before multipath hence not showing up under multipath.
> > 
> > The rhevh was installed on iSCSI lun. And yes, this lun isn't showing up under
> > multipath. I just double it may related to 725335.
> 
> So that would be a configuration problem.  Moving back to ON_QA.

I'm sorry for the confusion. I just tested it on rhev-hypervisor 6.2-07; creating a SAN storage domain failed.

1. # multipath -ll
3600a0b80005b0acc00008c204c7744ba dm-1 IBM,1726-4xx  FAStT
size=20G features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| |- 5:0:2:1 sdd 8:48  active ready running
| `- 6:0:1:1 sdh 8:112 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 5:0:0:1 sdb 8:16  active ghost running
  `- 6:0:0:1 sdf 8:80  active ghost running
3600a0b80005b10ca00008e254c7726b1 dm-0 IBM,1726-4xx  FAStT
size=20G features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| |- 5:0:0:0 sda 8:0   active ready running
| `- 6:0:0:0 sde 8:64  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 5:0:2:0 sdc 8:32  active ghost running
  `- 6:0:1:0 sdg 8:96  active ghost running

2. Please refer to rhevm.log and vdsm.log below.

Comment 35 Guohua Ouyang 2011-07-27 08:06:13 UTC
Created attachment 515438 [details]
rhevm.log-0727

Comment 36 Guohua Ouyang 2011-07-27 08:06:47 UTC
Created attachment 515439 [details]
vdsm.log-0727

Comment 37 Guohua Ouyang 2011-07-27 08:07:52 UTC
Created attachment 515440 [details]
add storage fail screen

Comment 38 Saggi Mizrahi 2011-07-27 10:09:33 UTC
Corner cases galore
http://gerrit.usersys.redhat.com/#change,751

Comment 39 Guohua Ouyang 2011-07-27 10:17:22 UTC
Tested Saggi's patch "devicemapper.py" (http://gerrit.usersys.redhat.com/#change,751); it works. The SAN storage is now added successfully and shows up in RHEV-M.

Comment 40 Guohua Ouyang 2011-07-28 10:16:22 UTC
Tested on rhev-h 6.2-09: adding iSCSI/FC storage failed, reporting the same error "cannot create volume group".

# multipath -ll
3600a0b80005b0acc00008c204c7744ba dm-1 IBM,1726-4xx  FAStT
size=20G features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| |- 5:0:2:1 sdd 8:48  active ready running
| `- 6:0:1:1 sdh 8:112 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 5:0:0:1 sdb 8:16  active ghost running
  `- 6:0:0:1 sdf 8:80  active ghost running
3600a0b80005b10ca00008e254c7726b1 dm-0 IBM,1726-4xx  FAStT
size=20G features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| |- 5:0:0:0 sda 8:0   active ready running
| `- 6:0:0:0 sde 8:64  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 5:0:2:0 sdc 8:32  active ghost running
  `- 6:0:1:0 sdg 8:96  active ghost running

Comment 41 Guohua Ouyang 2011-07-28 10:18:28 UTC
Created attachment 515679 [details]
rhevm.log-0728

Comment 42 Guohua Ouyang 2011-07-28 10:18:56 UTC
Created attachment 515680 [details]
vdsm.log-0728

On FC, the RHEV-M UI failed to scan LUN 3600a0b80005b10ca00008e254c7726b1; it showed only 3600a0b80005b0acc00008c204c7744ba, which has RHEV-H installed.

Comment 43 Guohua Ouyang 2011-07-28 10:21:03 UTC
Created attachment 515681 [details]
screenshot on fc

Comment 44 Guohua Ouyang 2011-07-28 14:06:44 UTC
The devices already existed in the DB table "luns". After removing those rows and re-testing, iSCSI/FC storage was added successfully and shows up under Storage.

Version: rhev-h 6.2-09.

Changing the status to VERIFIED.

Comment 45 errata-xmlrpc 2011-12-06 07:31:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html

