Bug 798635 - 3.1 - getVGInfo returns a partial LUN list on domains with more than one LUN, which causes HSMs to fail in ConnectStorageServer
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Eduardo Warszawski
QA Contact: Daniel Paikov
URL:
Whiteboard: storage
Duplicates: 836720, 896505
Depends On: 788096 836663
Blocks: 896505
 
Reported: 2012-02-29 13:07 UTC by Dafna Ron
Modified: 2022-07-09 05:34 UTC
CC List: 17 users

Fixed In Version: vdsm-4.9.6-41.0
Doc Type: Bug Fix
Doc Text:
Previously, the getVGInfo call showed only a partial list of LUNs when adding storage domains consisting of more than one LUN. Consequently, the HSM logged in only to the LUNs returned, and HSM hosts became non-operational. Now, storage domains are added correctly and all LUNs are returned.
Clone Of:
Clones: 896505
Environment:
Last Closed: 2012-12-04 18:54:24 UTC
Target Upstream Version:
Embargoed:


Attachments
logs (1.58 MB, application/x-gzip), 2012-02-29 13:07 UTC, Dafna Ron


Links
Red Hat Bugzilla 788096 (high, CLOSED): LVM: getting false report about faulty lun from pvs (last updated 2021-02-22 00:41:40 UTC)
Red Hat Knowledge Base (Solution) 129463 (last updated 2018-11-28 20:37:02 UTC)
Red Hat Product Errata RHSA-2012:1508 (normal, SHIPPED_LIVE): Important: rhev-3.1.0 vdsm security, bug fix, and enhancement update (last updated 2012-12-04 23:48:05 UTC)

Description Dafna Ron 2012-02-29 13:07:47 UTC
Created attachment 566546
logs

Description of problem:

When adding domains consisting of more than one LUN, the backend sends getVGInfo, and the query results show only a partial list of the LUNs.
As a result, the backend sends ConnectStorageServer only for the LUNs returned, and the HSM logs in only to those LUNs.
The HSM hosts then become non-operational, since not all of the LUNs were connected.
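To observe the mismatch directly on a host, vdsm's view can be compared with LVM's own view of the same VG (a minimal sketch reusing the query pattern from comment 1 below; <VG-NAME> is a placeholder for the storage domain's VG name):

# vdsm's view; on an affected host this reports a partial LUN list:
vdsClient 0 getVGInfo `vgs <VG-NAME> -o+uuid | awk '!/VG/{print $8}'`
# LVM's own view of the same VG lists every PV:
vgs --noheadings -o pv_name <VG-NAME>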

Version-Release number of selected component (if applicable):

vdsm-4.9-112.7.el6_2.x86_64
lvm2-2.02.87-6.el6.x86_64

How reproducible:

100%

Steps to Reproduce:
1. In a two-host cluster, create and attach a new domain consisting of several LUNs.

Actual results:

getVGInfo returns a partial list of the LUNs, so we do not log in to all LUNs on the HSM, and the host becomes non-operational.

Expected results:

getVGInfo should return the list of all LUNs.

Additional info: full logs attached

Comment 1 Haim 2012-02-29 13:12:45 UTC
The following script reproduces the problem without the engine:

To run it, replace the LUN list in the createVG command with your own (make sure multipath recognizes the devices):

vdsClient 0 createVG d55e3a64-0615-47ba-a500-4d81cacff52c \
    1avihai_lun1-700G41337842,1avihai_lun2-700G41337843,1avihai_newdc_lun1-800G41 &&
sleep 5 &&
vdsClient 0 createStorageDomain 3 d55e3a64-0615-47ba-a500-4d81cacff52c avihai-domain \
    `vgs d55e3a64-0615-47ba-a500-4d81cacff52c -o+uuid | awk '!/VG/{print $8}'` 1 2 &&
sleep 5 &&
vdsClient 0 getVGInfo \
    `vgs d55e3a64-0615-47ba-a500-4d81cacff52c -o+uuid | awk '!/VG/{print $8}'`
# output:
1JOAgI-25y9-BdbI-Zfm1-NaJM-UQYi-XOtxxk

Comment 2 Dafna Ron 2012-02-29 14:27:31 UTC
Haim and Eduardo investigated this issue further, and it appears to be caused by the LVM cache issue described in the following bug:

https://bugzilla.redhat.com/show_bug.cgi?id=788096

However, we need a workaround from vdsm for this issue until the LVM bug is fixed, since currently users cannot create VGs from multiple targets on our hosts without manually logging in to the LUNs.
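For context, the manual LUN login mentioned above would be done with iscsiadm. A hedged sketch (the portal address and target IQN are placeholders, not values from this bug):

# discover the targets behind a portal, then log in to each one so
# that all LUNs are visible before the VG is created:
iscsiadm -m discovery -t sendtargets -p 192.0.2.10:3260
iscsiadm -m node -T iqn.2012-02.example:storage.target1 -p 192.0.2.10:3260 --login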

Comment 3 Bill Sanford 2012-03-16 18:17:03 UTC
I just installed ic155.1 on a RHEL 6.2 GA server. I added two hosts. I then added iSCSI storage, and as soon as the SPM was determined, the other host went non-operational.

Both hosts have vdsm-4.9-112.6.el6_2.x86_64 installed.

Comment 4 Bill Sanford 2012-03-16 18:20:09 UTC
Also, the message in RHEV-M was: "Host <hostname> cannot access one of the Storage Domains attached to it, or the Data Center object. Setting the Host state to Non-Operational."

Comment 6 Eduardo Warszawski 2012-07-03 15:32:06 UTC
*** Bug 836720 has been marked as a duplicate of this bug. ***

Comment 7 Eduardo Warszawski 2012-07-03 16:13:22 UTC
In getVGInfo, the output of pvs is parsed.
This output can be inconsistent due to LVM BZ#836663.

When a device list is passed as an argument to pvs and not all of the devices have metadata areas in use (as in vdsm), the device order changes the pvs output.
(See the transcripts below.)

Possible solutions:
1) Require an LVM version that includes the BZ#836663 fix for use with vdsm.
2) Pass the PV with the in-use MDA as the first argument of the vdsm pvs command.
3) Parse the output of vgs -o +pv_name instead of the pvs response (see the sketch after the transcripts below).

[root@derez ~]# vgs -o pv_name 07622fad-381b-4b1d-b534-f9db364032c2
  PV                                           
  /dev/mapper/3600144f09dbd050000004e1994c60005
  /dev/mapper/3600144f09dbd050000004ddbe989001b
[root@derez ~]# 
[root@derez ~]# pvs -o pv_name,vg_name,pv_attr,pv_mda_used_count /dev/mapper/3600144f09dbd050000004e1994c60005 /dev/mapper/3600144f09dbd050000004ddbe989001b
  PV                                            VG                                   Attr #PMdaUse
  /dev/mapper/3600144f09dbd050000004ddbe989001b 07622fad-381b-4b1d-b534-f9db364032c2 a--         0
  /dev/mapper/3600144f09dbd050000004e1994c60005 07622fad-381b-4b1d-b534-f9db364032c2 a--         2
[root@derez ~]# 
[root@derez ~]# pvs -o pv_name,vg_name,pv_attr,pv_mda_used_count /dev/mapper/3600144f09dbd050000004ddbe989001b /dev/mapper/3600144f09dbd050000004e1994c60005
  PV                                            VG                                   Attr #PMdaUse
  /dev/mapper/3600144f09dbd050000004ddbe989001b                                      a--         0
  /dev/mapper/3600144f09dbd050000004e1994c60005 07622fad-381b-4b1d-b534-f9db364032c2 a--         2
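
A minimal sketch of option 3 (an assumption about what the fix could look like, not the committed change; the VG name is taken from the transcripts above). Because vgs is keyed by the VG itself rather than by a caller-supplied device list, its PV list is complete regardless of device order or in-use metadata areas:

# --noheadings drops the "PV" header row, so no awk filter is needed;
# the PV list comes back complete in either device order:
vgs --noheadings -o pv_name 07622fad-381b-4b1d-b534-f9db364032c2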

Comment 18 Eduardo Warszawski 2012-10-17 16:51:10 UTC
Change-Id: I9fdf09a2637f096201f094918654e3f52663bc2d

Comment 20 Daniel Paikov 2012-11-06 17:01:53 UTC
Checked on si24.

Comment 22 errata-xmlrpc 2012-12-04 18:54:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1508.html

Comment 23 Allon Mureinik 2013-01-20 20:48:52 UTC
*** Bug 896505 has been marked as a duplicate of this bug. ***

