Bug 1159637 - [JSONRPC] Extend size operation of an attached iSCSI domain fails with CDA
Summary: [JSONRPC] Extend size operation of an attached iSCSI domain fails with CDA
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Piotr Kliczewski
QA Contact: Ori Gofen
URL:
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2014-11-02 13:39 UTC by Ori Gofen
Modified: 2016-05-26 01:51 UTC
CC: 16 users

Fixed In Version: vdsm-4.16.7.6-1.el6ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-16 13:39:11 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
logs+image (1.26 MB, application/x-bzip)
2014-11-02 13:39 UTC, Ori Gofen


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 35014 0 master MERGED jsonrpc: Wrong key name for getDevicesVisibility Never
oVirt gerrit 35135 0 master MERGED jsonrpc: wrong response key name Never
oVirt gerrit 35223 0 ovirt-engine-3.5 MERGED jsonrpc: wrong response key name Never
oVirt gerrit 35306 0 ovirt-3.5 MERGED jsonrpc: Wrong key name for getDevicesVisibility Never

Description Ori Gofen 2014-11-02 13:39:57 UTC
Created attachment 952906 [details]
logs+image

Description of problem:

The detach-attach operation on an iSCSI storage domain fails to expose all LUNs to the receiving host.
Upon attach, the receiving host reports 3 sessions out of 4, while it is logged in to each session twice:

root@RHEL7Server /rhev/data-center
 # iscsiadm -m session
tcp: [1] 10.35.160.107:3260,4 iqn.1992-04.com.emc:cx.ckm00121000438.b7 (non-flash)
tcp: [2] 10.35.160.107:3260,4 iqn.1992-04.com.emc:cx.ckm00121000438.b7 (non-flash)
tcp: [3] 10.35.160.105:3260,2 iqn.1992-04.com.emc:cx.ckm00121000438.a7 (non-flash)
tcp: [4] 10.35.160.105:3260,2 iqn.1992-04.com.emc:cx.ckm00121000438.a7 (non-flash)
tcp: [5] 10.35.160.104:3260,1 iqn.1992-04.com.emc:cx.ckm00121000438.a6 (non-flash)
tcp: [6] 10.35.160.106:3260,3 iqn.1992-04.com.emc:cx.ckm00121000438.b6 (non-flash)
tcp: [7] 10.35.160.106:3260,3 iqn.1992-04.com.emc:cx.ckm00121000438.b6 (non-flash)

In the UI discovery window, the receiving host appears connected to all sessions.
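
For reference, a minimal sketch (not vdsm code; it only assumes iscsiadm is on PATH and the session listing format shown above) that counts logins per portal and target, which makes the duplicate logins easy to spot:

# Sketch only: count iSCSI logins per (portal, target) from "iscsiadm -m session".
import re
import subprocess
from collections import Counter

def session_counts():
    out = subprocess.run(["iscsiadm", "-m", "session"],
                         capture_output=True, text=True, check=True).stdout
    counts = Counter()
    for line in out.splitlines():
        # e.g. "tcp: [1] 10.35.160.107:3260,4 iqn.1992-04.com.emc:... (non-flash)"
        m = re.match(r"\S+\s+\[\d+\]\s+(\S+)\s+(\S+)", line)
        if m:
            counts[m.groups()] += 1
    return counts

if __name__ == "__main__":
    for (portal, iqn), n in sorted(session_counts().items()):
        print("%d session(s) to %s via %s" % (n, iqn, portal))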

The attach operation is successful, but when we attempt Edit -> Extend Size, the operation fails with CDA (see attached image).

engine throws:

2014-11-02 15:17:37,452 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-7) [8b1c200] Failed in GetDevicesVisibilityVDS method
2014-11-02 15:17:37,453 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-7) [8b1c200] Command org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand return value

DevicesVisibilityMapReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=-32603, mMessage=Internal JSON-RPC error.]]

2014-11-02 15:17:37,458 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-7) [8b1c200] HostName = 10.35.2.53
2014-11-02 15:17:37,459 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-7) [8b1c200] Command GetDevicesVisibilityVDSCommand(HostName = 10.35.2.53, HostId = 46ceb82f-c1ca-4511-a840-bf1067b12e35, devicesIds=[360060160f4a03000b1f68463c259e411, 360060160f4a03000acefe985b05ee411, 360060160f4a030000c87bc4db05ee411]) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetDevicesVisibilityVDS, error = Internal JSON-RPC error., code = -32603
2014-11-02 15:17:37,459 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-7) [8b1c200] FINISH, GetDevicesVisibilityVDSCommand, log id: 1477bcc7
2014-11-02 15:17:37,459 ERROR [org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand] (ajp-/127.0.0.1:8702-7) [8b1c200] Command org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetDevicesVisibilityVDS, error = Internal JSON-RPC error., code = -32603 (Failed with error unexpected and code 16)
2014-11-02 15:17:37,482 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-7) [8b1c200] Correlation ID: 8b1c200, Job ID: da163531-36ef-4530-b20a-f4e8b6e3efc4, Call Stack: null, Custom Event ID: -1, Message: Failed to connect Host 10.35.2.53 to device. (User: admin)
2014-11-02 15:17:37,488 WARN  [org.ovirt.engine.core.bll.storage.ExtendSANStorageDomainCommand] (ajp-/127.0.0.1:8702-7) [8b1c200] CanDoAction of action ExtendSANStorageDomain failed. Reasons:VAR__TYPE__STORAGE__DOMAIN,VAR__ACTION__EXTEND,ERROR_CANNOT_EXTEND_CONNECTION_FAILED,$hostName 10.35.2.53,$lun


Version-Release number of selected component (if applicable):
vt8

How reproducible:
100%

Steps to Reproduce:
* Have a storage group with 4 IQNs
1. (on first DC) Create an iSCSI domain
2. (on first DC) Create an NFS domain
3. (on second DC) Create an NFS domain
4. Detach the iSCSI domain and attach it to the second DC
5. Extend its size (choose all available LUNs if you don't know which one will fail the operation)

Actual results:
The operation fails with CDA

Expected results:
The operation should succeed.

Additional info:
ogofen-vdsb (receiving host)

Comment 2 Allon Mureinik 2014-11-06 16:02:26 UTC
What sticks out for me is this error:

2014-11-02 15:17:37,453 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (ajp-/127.0.0.1:8702-7) [8b1c200] Command org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand return value

DevicesVisibilityMapReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=-32603, mMessage=Internal JSON-RPC error.]]


Piotr, can you take a look please?

Comment 3 Piotr Kliczewski 2014-11-07 07:52:08 UTC
This error code is returned when something unexpected happens during the invocation of a verb.
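
For context, -32603 is the generic "Internal error" code defined by the JSON-RPC 2.0 spec, so the message the engine sees carries no detail about what actually went wrong inside the verb. A minimal illustration of such an error envelope (the request id below is hypothetical):

# Illustration only: roughly the JSON-RPC 2.0 error envelope returned when an
# unhandled exception escapes a verb; -32603 is the spec's generic
# "Internal error" code. The request id is a made-up example.
import json

error_response = {
    "jsonrpc": "2.0",
    "id": "example-request-id",
    "error": {
        "code": -32603,
        "message": "Internal JSON-RPC error.",
    },
}
print(json.dumps(error_response, indent=2))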

Looking at the vdsm logs, I found:

Thread-235::DEBUG::2014-11-02 15:17:37,013::lvm::288::Storage.Misc.excCmd::(cmd) FAILED: <err> = '  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!\n  TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.\n  Couldn\'t find device with uuid O2lLNq-ySlq-WSPd-6MsF-VPfN-ZUQ8-05vjJn.\n  Couldn\'t find device with uuid Xb49tk-bHna-TNRG-tP2Y-Lgsx-Sc5P-JrDppd.\n  Couldn\'t find device with uuid RpmtJa-6nlX-c83h-ZeRE-ATGY-9MWF-2r5gsQ.\n  Couldn\'t find device with uuid daeLdr-o4yg-BHt5-4ohE-3Tlm-pA3B-WyRwAc.\n  Can\'t initialize physical volume "/dev/mapper/360060160f4a03000e9492254c259e411" of volume group "577145c7-5949-4abf-b5af-7056948e165a" without -ff\n  Can\'t initialize physical volume "/dev/mapper/360060160f4a030007da7d711c259e411" of volume group "577145c7-5949-4abf-b5af-7056948e165a" without -ff\n  Can\'t initialize physical volume "/dev/mapper/360060160f4a03000b1f68463c259e411" of volume group "95da13b9-b971-4e74-85cc-9f42d0b0df60" without -ff\n  Can\'t initialize physical volume "/dev/mapper/360060160f4a030000c87bc4db05ee411" of volume group "6c2c285d-a930-4eb9-aca8-91bf2940ee72" without -ff\n  You have a memory leak (not released memory pool):\n   [0x7fe018e6c480] format_instance\n  Internal error: Unreleased memory pool(s) found.\n'; <rc> = 5
Thread-235::DEBUG::2014-11-02 15:17:37,014::lvm::852::Storage.LVM::(testPVCreate) rc: 5, out: ['  Physical volume "/dev/mapper/360060160f4a03000acefe985b05ee411" successfully created'], err: ['  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!', '  TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.', "  Couldn't find device with uuid O2lLNq-ySlq-WSPd-6MsF-VPfN-ZUQ8-05vjJn.", "  Couldn't find device with uuid Xb49tk-bHna-TNRG-tP2Y-Lgsx-Sc5P-JrDppd.", "  Couldn't find device with uuid RpmtJa-6nlX-c83h-ZeRE-ATGY-9MWF-2r5gsQ.", "  Couldn't find device with uuid daeLdr-o4yg-BHt5-4ohE-3Tlm-pA3B-WyRwAc.", '  Can\'t initialize physical volume "/dev/mapper/360060160f4a03000e9492254c259e411" of volume group "577145c7-5949-4abf-b5af-7056948e165a" without -ff', '  Can\'t initialize physical volume "/dev/mapper/360060160f4a030007da7d711c259e411" of volume group "577145c7-5949-4abf-b5af-7056948e165a" without -ff', '  Can\'t initialize physical volume "/dev/mapper/360060160f4a03000b1f68463c259e411" of volume group "95da13b9-b971-4e74-85cc-9f42d0b0df60" without -ff', '  Can\'t initialize physical volume "/dev/mapper/360060160f4a030000c87bc4db05ee411" of volume group "6c2c285d-a930-4eb9-aca8-91bf2940ee72" without -ff', '  You have a memory leak (not released memory pool):', '   [0x7fe018e6c480] format_instance', '  Internal error: Unreleased memory pool(s) found.'], unusedDevs: set(['/dev/mapper/360060160f4a03000acefe985b05ee411']), usedDevs: set(['/dev/mapper/360060160f4a03000e9492254c259e411', '/dev/mapper/360060160f4a030007da7d711c259e411', '/dev/mapper/360060160f4a030000c87bc4db05ee411', '/dev/mapper/360060160f4a03000b1f68463c259e411'])

Is this something expected?
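
(The testPVCreate output above comes from running pvcreate in LVM test mode, which is why the metadata is never written. For reference, a standalone sketch of that kind of dry-run check, not the vdsm implementation, using a placeholder device path:)

# Sketch only: dry-run pvcreate with LVM's --test option so no metadata is
# written, and treat a "without -ff" complaint in stderr as "this device is
# already a PV that belongs to a volume group". Requires root; the device
# path below is a placeholder.
import subprocess

def test_pv_create(device):
    proc = subprocess.run(["pvcreate", "--test", device],
                          capture_output=True, text=True)
    used = proc.returncode != 0 and "without -ff" in proc.stderr
    return proc.returncode, used

if __name__ == "__main__":
    rc, used = test_pv_create("/dev/mapper/example-device")
    print("rc=%d, already used as a PV: %s" % (rc, used))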

Comment 4 Ori Gofen 2014-11-10 15:03:30 UTC
This bug actually consists of two bugs. The CDA on extending block domains has nothing to do with the iSCSI registration issue I described; it just seemed natural that that was the problem, but it isn't. The IQN duplication bug happens only on hosts with multiple NICs and will be opened separately from this bug.

Comment 5 Ori Gofen 2014-11-10 15:12:53 UTC
Updated the bug again :)
The failure is due to another JSON-RPC bug; I couldn't reproduce it on hosts that use XML-RPC.
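
The linked gerrit patches ("jsonrpc: Wrong key name for getDevicesVisibility" / "wrong response key name") point at this class of problem. A hypothetical sketch of how a response key-name mismatch in the JSON-RPC bridge fails while the XML-RPC path keeps working (the key names below are made up for illustration, not the actual vdsm schema):

# Hypothetical illustration of the bug class: the verb returns its result under
# one key, the JSON-RPC bridge looks it up under another, and the resulting
# KeyError surfaces to the engine only as "Internal JSON-RPC error." (-32603).
def get_devices_visibility(guids):
    # verb-level result; the XML-RPC path passes this through untouched
    return {"visible": {guid: True for guid in guids}}

def jsonrpc_bridge(result):
    # buggy bridge: wrong key name -> KeyError on every call
    return result["visibility"]

if __name__ == "__main__":
    res = get_devices_visibility(["360060160f4a03000b1f68463c259e411"])
    try:
        jsonrpc_bridge(res)
    except KeyError as exc:
        print("bridge failed on missing key: %s" % exc)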

Comment 7 Ori Gofen 2014-11-16 16:49:39 UTC
The IQN duplication bug is tracked as bz #1164576.

Comment 8 Dan Kenigsberg 2014-11-19 16:55:57 UTC
This is a regression versus rhev-3.4 (with xmlrpc). It must be solved in rhev-3.5.0.

Comment 9 Ori Gofen 2014-12-02 11:31:00 UTC
Verified on vt12.

