Bug 798635

Summary: 3.1 - getVGInfo returns with partial luns list on domains with more than one lun which causes hsm's to fail in ConnectStorageServer

Product: Red Hat Enterprise Linux 6
Component: vdsm
Version: 6.2
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: medium
Reporter: Dafna Ron <dron>
Assignee: Eduardo Warszawski <ewarszaw>
QA Contact: Daniel Paikov <dpaikov>
CC: abaron, aburden, ashoham, bazulay, bsanford, cpelland, danken, dpaikov, hateya, iheim, ilvovsky, jbiddle, lyarwood, mgoldboi, mkrcmari, pvine, ykaul
Target Milestone: rc
Target Release: ---
Keywords: Regression, ZStream
Whiteboard: storage
Fixed In Version: vdsm-4.9.6-41.0
Doc Type: Bug Fix
Doc Text: Previously, the getVGInfo call would show only a partial list of LUNs when adding storage domains consisting of more than one LUN. Consequently, the HSM would log in only to the LUNs returned, and HSM hosts became non-operational. Now, storage domains are added correctly and all LUNs are always returned.
Clones: 896505 (view as bug list)
Last Closed: 2012-12-04 18:54:24 UTC
Bug Depends On: 788096, 836663
Bug Blocks: 896505
The following script reproduces the problem without the engine. To run it, replace the LUN list in the createVG command with your own (make sure multipath recognizes it):

```shell
vdsClient 0 createVG d55e3a64-0615-47ba-a500-4d81cacff52c 1avihai_lun1-700G41337842,1avihai_lun2-700G41337843,1avihai_newdc_lun1-800G41 && \
sleep 5 && \
vdsClient 0 createStorageDomain 3 d55e3a64-0615-47ba-a500-4d81cacff52c avihai-domain `vgs d55e3a64-0615-47ba-a500-4d81cacff52c -o+uuid | awk '!/VG/{print $8}'` 1 2 && \
sleep 5 && \
vdsClient 0 getVGInfo `vgs d55e3a64-0615-47ba-a500-4d81cacff52c -o+uuid | awk '!/VG/{print $8}'` 1JOAgI-25y9-BdbI-Zfm1-NaJM-UQYi-XOtxxk
```

Haim and Eduardo investigated this issue further, and it appears to be caused by the LVM cache issue described in bug https://bugzilla.redhat.com/show_bug.cgi?id=788096. However, we need a workaround in vdsm for this issue until the LVM bug is fixed, since currently users will not be able to create VGs from multiple targets on our hosts without a manual login to the LUNs.

I just installed ic155.1 on a RHEL 6.2 GA server. I added two hosts, then added iSCSI storage, and as soon as the SPM was determined, the other host went non-operational. Both hosts have vdsm-4.9-112.6.el6_2.x86_64 installed. The message in RHEV-M was "Host <hostname> cannot access one of the Storage Domains attached to it, or the Data Center object. Setting the Host state to Non-Operational."

*** Bug 836720 has been marked as a duplicate of this bug. ***

In getVGInfo the output of pvs is parsed. This output can be inconsistent due to LVM BZ#836663. When a device list is passed as an argument to pvs and not all of the devices have metadata areas in use (as in vdsm), the device order changes the pvs output. (See below.)

Possible solutions:
1) Require an LVM version including the BZ#836663 fix for use with vdsm.
2) Pass the PV with the metadata area in use as the first parameter in the vdsm pvs command.
3) Parse the output of vgs -o +pv_name instead of the pvs response.
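Option 3 above can be sketched as a small helper. This is a minimal illustration, not vdsm's actual code; `get_vg_pv_names` and `parse_pv_names` are hypothetical names, and the sketch assumes LVM is invoked via subprocess:

```python
import subprocess


def parse_pv_names(vgs_output):
    """Parse `vgs --noheadings -o pv_name <vg>` output:
    one whitespace-padded PV path per line."""
    return [line.strip() for line in vgs_output.splitlines() if line.strip()]


def get_vg_pv_names(vg_name):
    """Return every PV path in a VG by querying vgs instead of pvs.

    Unlike `pvs <device list>`, the per-VG report of vgs does not
    depend on the order of devices passed on the command line, so it
    sidesteps the inconsistency of BZ#836663.
    """
    out = subprocess.check_output(
        ["vgs", "--noheadings", "-o", "pv_name", vg_name],
        universal_newlines=True,
    )
    return parse_pv_names(out)
```

The parsing is kept in a separate pure function so it can be exercised against captured vgs output without a live LVM setup.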
```shell
[root@derez ~]# vgs -o pv_name 07622fad-381b-4b1d-b534-f9db364032c2
  PV
  /dev/mapper/3600144f09dbd050000004e1994c60005
  /dev/mapper/3600144f09dbd050000004ddbe989001b

[root@derez ~]# pvs -o pv_name,vg_name,pv_attr,pv_mda_used_count /dev/mapper/3600144f09dbd050000004e1994c60005 /dev/mapper/3600144f09dbd050000004ddbe989001b
  PV                                            VG                                   Attr #PMdaUse
  /dev/mapper/3600144f09dbd050000004ddbe989001b 07622fad-381b-4b1d-b534-f9db364032c2 a--         0
  /dev/mapper/3600144f09dbd050000004e1994c60005 07622fad-381b-4b1d-b534-f9db364032c2 a--         2

[root@derez ~]# pvs -o pv_name,vg_name,pv_attr,pv_mda_used_count /dev/mapper/3600144f09dbd050000004ddbe989001b /dev/mapper/3600144f09dbd050000004e1994c60005
  PV                                            VG                                   Attr #PMdaUse
  /dev/mapper/3600144f09dbd050000004ddbe989001b                                      a--         0
  /dev/mapper/3600144f09dbd050000004e1994c60005 07622fad-381b-4b1d-b534-f9db364032c2 a--         2
```

Change-Id: I9fdf09a2637f096201f094918654e3f52663bc2d

Checked on si24.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-1508.html

*** Bug 896505 has been marked as a duplicate of this bug. ***
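The device-order workaround (option 2, passing the metadata-bearing PV first) can be sketched as follows. `mda_use_count` and `order_devices_for_pvs` are hypothetical helper names for illustration, not functions from vdsm:

```python
import subprocess


def mda_use_count(device):
    """Ask LVM how many metadata areas are in use on one device."""
    out = subprocess.check_output(
        ["pvs", "--noheadings", "-o", "pv_mda_used_count", device],
        universal_newlines=True,
    )
    return int(out.strip())


def order_devices_for_pvs(devices, counts=None):
    """Sort the device list so PVs with in-use metadata areas come first.

    pvs then encounters a metadata-bearing PV before the bare ones,
    avoiding the inconsistent output shown in the transcript above
    (BZ#836663). `counts` may be supplied for testing; otherwise each
    device is queried via pvs.
    """
    if counts is None:
        counts = {d: mda_use_count(d) for d in devices}
    return sorted(devices, key=lambda d: counts[d], reverse=True)
```

The reordered list would then be passed to the existing pvs invocation unchanged.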
Created attachment 566546 [details]
logs

Description of problem:
When adding domains consisting of more than one LUN, the backend sends getVGInfo, and the query results show only a partial list of the LUNs. As a result, the backend sends ConnectStorageServer only for the LUNs returned, and the HSM logs in only to those LUNs. The HSM hosts then become non-operational, since not all of the LUNs were connected.

Version-Release number of selected component (if applicable):
vdsm-4.9-112.7.el6_2.x86_64
lvm2-2.02.87-6.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. In a two-host cluster, create and attach a new domain consisting of several LUNs.

Actual results:
getVGInfo returns a partial list of the LUNs, and the HSM does not log in to all of the LUNs. The host becomes non-operational.

Expected results:
getVGInfo should return the list of all LUNs.

Additional info:
Full logs attached.