Bug 1539108 - Silence bogus error when trying to check clean LUN status in getDeviceList
Summary: Silence bogus error when trying to check clean LUN status in getDeviceList
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.20.15
Hardware: Unspecified
OS: Unspecified
unspecified
low
Target Milestone: ovirt-4.2.2
: ---
Assignee: shani
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-01-26 16:53 UTC by Nir Soffer
Modified: 2018-03-29 10:57 UTC (History)
7 users (show)

Fixed In Version: vdsm v4.20.18
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-29 10:57:54 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.2+


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1533035 0 unspecified CLOSED [Scale] getDeviceList slowdown when passing big list of guids 2021-02-22 00:41:40 UTC
oVirt gerrit 86032 0 master MERGED hsm: improve performance of getDeviceList(guids) 2018-01-29 07:59:13 UTC

Internal Links: 1533035

Description Nir Soffer 2018-01-26 16:53:40 UTC
Description of problem:

When creating a new storage domain, we:
1. Get list of devices using getDeviceList
2. Use select devices for the new storage domain
3. We check if the devices can be used to create a new PV

If the selected devices is not a PV yet, we get confusing error in vdsm log:

1. Getting available devices

2018-01-25 21:00:57,214-0500 INFO  (jsonrpc/0) [vdsm.api] START getDeviceList(storageType=3, guids=(), checkStatus=False, options={}) from=::ffff:192.168.201.4,46194, flow_id=ef040cd5-361d-4feb-98a8-d67a54241713, task_id=bd5fd679-de8b-47c3-be62-ced43051ec79 (api:46)

2018-01-25 21:00:57,617-0500 INFO  (jsonrpc/0) [vdsm.api] FINISH getDeviceList return={'devList': [{'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity': '21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '', 'pathlist': [{'initiatorname': u'default', 'connection': u'192.168.200.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}, {'initiatorname': u'default', 'connection': u'192.168.201.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}], 'logicalblocksize': '512', 'discard_max_bytes': 1073741824, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sdb', 'capacity': '21474836480', 'state': 'active', 'lun': '1'}, {'type': 'iSCSI', 'physdev': 'sdg', 'capacity': '21474836480', 'state': 'active', 'lun': '1'}], 'devtype': 'iSCSI', 'physicalblocksize': '512', 'pvUUID': '', 'serial': 'SLIO-ORG_lun1_bdev_4552fe37-3c40-4f78-8ea3-a43782040d20', 'GUID': '360014054552fe373c404f788ea3a4378', 'productID': 'lun1_bdev'}, {'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity': '21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '', 'pathlist': [{'initiatorname': u'default', 'connection': u'192.168.200.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}, {'initiatorname': u'default', 'connection': u'192.168.201.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}], 'logicalblocksize': '512', 'discard_max_bytes': 1073741824, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sdc', 'capacity': '21474836480', 'state': 'active', 'lun': '2'}, {'type': 'iSCSI', 'physdev': 'sdh', 'capacity': '21474836480', 'state': 'active', 'lun': '2'}], 'devtype': 'iSCSI', 'physicalblocksize': '512', 'pvUUID': '', 'serial': 'SLIO-ORG_lun2_bdev_66b26ed0-e4f2-4fea-bd13-6bf33bad8790', 'GUID': '3600140566b26ed0e4f24feabd136bf33', 'productID': 'lun2_bdev'}, {'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity': '21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '', 'pathlist': [{'initiatorname': u'default', 'connection': u'192.168.200.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}, {'initiatorname': u'default', 'connection': u'192.168.201.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}], 'logicalblocksize': '512', 'discard_max_bytes': 1073741824, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sdd', 'capacity': '21474836480', 'state': 'active', 'lun': '3'}, {'type': 'iSCSI', 'physdev': 'sdi', 'capacity': '21474836480', 'state': 'active', 'lun': '3'}], 'devtype': 'iSCSI', 'physicalblocksize': '512', 'pvUUID': '', 'serial': 'SLIO-ORG_lun3_bdev_5c94d39c-8c41-4e07-ac28-5ffad57f4737', 'GUID': '360014055c94d39c8c414e07ac285ffad', 'productID': 'lun3_bdev'}, {'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity': '21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '', 'pathlist': [{'initiatorname': u'default', 'connection': u'192.168.200.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}, {'initiatorname': u'default', 'connection': u'192.168.201.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}], 'logicalblocksize': '512', 'discard_max_bytes': 1073741824, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sde', 'capacity': '21474836480', 'state': 'active', 'lun': '4'}, {'type': 'iSCSI', 'physdev': 'sdj', 'capacity': '21474836480', 'state': 'active', 'lun': '4'}], 'devtype': 'iSCSI', 'physicalblocksize': '512', 'pvUUID': '', 'serial': 'SLIO-ORG_lun4_bdev_cb451d57-1edc-49d4-b58a-60e67ccc3398', 'GUID': '36001405cb451d571edc49d4b58a60e67', 'productID': 'lun4_bdev'}, {'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity': '21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '', 'pathlist': [{'initiatorname': u'default', 'connection': u'192.168.200.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}, {'initiatorname': u'default', 'connection': u'192.168.201.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}], 'logicalblocksize': '512', 'discard_max_bytes': 1073741824, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sda', 'capacity': '21474836480', 'state': 'active', 'lun': '0'}, {'type': 'iSCSI', 'physdev': 'sdf', 'capacity': '21474836480', 'state': 'active', 'lun': '0'}], 'devtype': 'iSCSI', 'physicalblocksize': '512', 'pvUUID': '', 'serial': 'SLIO-ORG_lun0_bdev_daa12eb5-82b3-4c92-9e40-2a85e13b0c45', 'GUID': '36001405daa12eb582b34c929e402a85e', 'productID': 'lun0_bdev'}]} from=::ffff:192.168.201.4,46194, flow_id=ef040cd5-361d-4feb-98a8-d67a54241713, task_id=bd5fd679-de8b-47c3-be62-ced43051ec79 (api:52)

2018-01-25 21:00:57,619-0500 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Host.getDeviceList succeeded in 0.40 seconds (__init__:573)


2. Check status of LUN 360014054552fe373c404f788ea3a4378

2018-01-25 21:00:57,749-0500 INFO  (jsonrpc/1) [vdsm.api] START getDeviceList(storageType=3, guids=[u'360014054552fe373c404f788ea3a4378'], checkStatus=False, options={}) from=::ffff:192.168.201.4,46194, flow_id=ef040cd5-361d-4feb-98a8-d67a54241713, task_id=90e19c5e-570f-4971-8376-36bbae54cd59 (api:46)


3. Expected error is not handled correctly:

2018-01-25 21:00:58,092-0500 WARN  (jsonrpc/1) [storage.LVM] lvm pvs failed: 5 [] ['  Failed to find physical volume "/dev/mapper/360014054552fe373c404f788ea3a4378".'] (lvm:322)
2018-01-25 21:00:58,092-0500 WARN  (jsonrpc/1) [storage.HSM] getPV failed for guid: 360014054552fe373c404f788ea3a4378 (hsm:1984)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1981, in _getDeviceList
    pv = lvm.getPV(guid)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 844, in getPV
    raise se.InaccessiblePhysDev((pvName,))
InaccessiblePhysDev: Multipath cannot access physical device(s): "devices=(u'360014054552fe373c404f788ea3a4378',)"

Sure this PV is not available, we did not create the PV yet. This is an 
expected condition should not be logged with a traceback. A debug log
would be best.


4. We return the device status, and the call succeed

2018-01-25 21:00:58,126-0500 INFO  (jsonrpc/1) [vdsm.api] FINISH getDeviceList return={'devList': [{'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity': '21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '', 'pathlist': [{'initiatorname': u'default', 'connection': u'192.168.200.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}, {'initiatorname': u'default', 'connection': u'192.168.201.4', 'iqn': u'iqn.2014-07.org.ovirt:storage', 'portal': '1', 'user': u'username', 'password': u'password', 'port': '3260'}], 'logicalblocksize': '512', 'discard_max_bytes': 1073741824, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sdb', 'capacity': '21474836480', 'state': 'active', 'lun': '1'}, {'type': 'iSCSI', 'physdev': 'sdg', 'capacity': '21474836480', 'state': 'active', 'lun': '1'}], 'devtype': 'iSCSI', 'physicalblocksize': '512', 'pvUUID': '', 'serial': 'SLIO-ORG_lun1_bdev_4552fe37-3c40-4f78-8ea3-a43782040d20', 'GUID': '360014054552fe373c404f788ea3a4378', 'productID': 'lun1_bdev'}]} from=::ffff:192.168.201.4,46194, flow_id=ef040cd5-361d-4feb-98a8-d67a54241713, task_id=90e19c5e-570f-4971-8376-36bbae54cd59 (api:52)
2018-01-25 21:00:58,127-0500 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getDeviceList succeeded in 0.38 seconds (__init__:573)


Version-Release number of selected component (if applicable):
Any

How reproducible:
Always, we see this on every OST build.

Steps to Reproduce:
1. Create storage domain with a new empty LUN

Actual results:
Confusing errors in vdsm log

Expected results:
Expected condition should not be logged as ERROR or with a traceback.

Suggested fix:
Handle this error in getDeviceList and log a DEBUG message (Cannot find a PV xxxyyy)

This should be fixed by the proposed patch for bug 1533035, but that
fix requires lot of testing, while fixing this is trivial.

Suggested target: 4.2.z

Comment 1 Allon Mureinik 2018-01-28 21:10:58 UTC
Shani, Nir, could this BZ be the root cause for bug 1537604?

Comment 2 Nir Soffer 2018-01-29 06:38:51 UTC
(In reply to Allon Mureinik from comment #1)
> Shani, Nir, could this BZ be the root cause for bug 1537604?

This is not related to bug 1537604. This bug is about the bogus error then failing
to find a PV. This error does not change the flow.

Comment 3 Nir Soffer 2018-01-29 07:33:46 UTC
Tal, this is fixed by the patch for bug 1533035, targeted for 4.2.2. I think we
should target this bug to 4.2.2 as well.

Comment 4 Kevin Alon Goldblatt 2018-02-19 16:39:40 UTC
Verified with the following code;
-----------------------------------------
ovirt-engine-4.2.2-0.1.el7.noarch
vdsm-4.20.18-1.el7ev.x86_64

Verified with the following scenario:
-----------------------------------------
1. Created a new Storage domain with a new LUN

No errors are reported - getDiviceList is successfull

Comment 5 Sandro Bonazzola 2018-03-29 10:57:54 UTC
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.