Bug 1161021 - [BLOCKED] Attach of Imported Storage Domain (still attached in the meta data) fails when calling force detach with json RPC
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-engine-core
Version: 3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.5.1
Assignee: Maor
QA Contact: Pavel Stehlik
URL:
Whiteboard: storage
Depends On:
Blocks: 1162283 1193195
 
Reported: 2014-11-06 07:45 UTC by Maor
Modified: 2016-02-10 19:44 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-11-18 07:33:40 UTC
oVirt Team: Storage
Embargoed:


Attachments
engine log (7.08 MB, text/plain)
2014-11-06 07:45 UTC, Maor
vdsm.log (12.35 MB, text/plain)
2014-11-07 11:44 UTC, Raul Laansoo
dmesg.log (204.46 KB, text/plain)
2014-11-07 13:59 UTC, Raul Laansoo
node logs (112.18 KB, application/x-gzip)
2014-11-07 15:52 UTC, Raul Laansoo

Description Maor 2014-11-06 07:45:50 UTC
Created attachment 954332 [details]
engine log

Description of problem:
The user tried to import an existing FC storage domain, and the operation failed with the following errors in the engine log:

[DetachStorageDomainVDSCommand] (ajp--127.0.0.1-8702-2) [69fba16c] Could not force detach domain 46243ce5-face-483e-9a40-7daea77d82a3 on pool 4e14574d-9472-4e4a-a44a-140acbb790bb. error: org.ovirt.engine.core.vdsbroker.irsbroker.IRSErrorException: IRSGenericException: IRSErrorException: Failed to DetachStorageDomainVDS, error = detach() takes exactly 5 arguments (3 given), code = -32603
2014-11-04 09:47:31,980 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DetachStorageDomainVDSCommand] (ajp--127.0.0.1-8702-2) [69fba16c] FINISH, DetachStorageDomainVDSCommand, log id: 5b4d7f9a
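
For context, "detach() takes exactly 5 arguments (3 given)" is the standard Python 2 TypeError for a bound method with four required parameters (five counting 'self') that is invoked with only two. The snippet below is a minimal, hypothetical sketch of how a JSON-RPC bridge that forwards fewer positional arguments than the handler requires produces exactly this message; it is not the actual vdsm code, and the class and parameter names are illustrative only.

class StoragePool(object):
    # Hypothetical handler: four required parameters, five counting 'self'.
    def detach(self, sdUUID, msdUUID, masterVersion, force):
        pass

def dispatch(obj, method, args):
    # A bridge that blindly forwards whatever positional arguments
    # arrived on the wire.
    return getattr(obj, method)(*args)

pool = StoragePool()
try:
    # A force-detach request carrying only the domain UUID and a force flag
    # (2 arguments, 3 counting 'self') fails before any storage work is done.
    dispatch(pool, 'detach', ['46243ce5-face-483e-9a40-7daea77d82a3', True])
except TypeError as e:
    # Python 2 wording: detach() takes exactly 5 arguments (3 given)
    print(e)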

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a Storage Domain with FC on a Data Center using Host with json RPC
2. Destroy the domain while it is still attached to the Data Center
3. Try to import the Storage Domain to another Data Center

Actual results:
Attaching the Storage Domain fails, since the force detach fails in VDSM with the error mentioned above.

Expected results:
The Storage Domain should be attached

Additional info:
The error only reproduces when the host uses JSON-RPC.
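
For reference, the point of failure in step 3 can also be driven from the oVirt Python SDK (v3, ovirtsdk). The sketch below only covers attaching the already-imported domain to the second Data Center, which is the operation that triggers the failing force detach; the URL, credentials, and the domain and Data Center names are placeholders, so treat this as a sketch rather than the exact reproduction path.

from ovirtsdk.api import API
from ovirtsdk.xml import params

# Placeholder connection details.
api = API(url='https://engine.example.com/api',
          username='admin@internal', password='password', insecure=True)

# The FC domain imported in step 3 (still unattached) and the target DC.
sd = api.storagedomains.get(name='fc_domain')
dc = api.datacenters.get(name='second_dc')

# Attaching the domain is what makes the engine force-detach it from the
# stale pool still recorded in the domain metadata; on a JSON-RPC host this
# is where DetachStorageDomainVDSCommand failed.
dc.storagedomains.add(params.StorageDomain(id=sd.get_id()))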

Comment 1 Barak 2014-11-06 12:51:52 UTC
Maor,

1 - I assume it works with XML-RPC?
2 - What is the Destroy operation mentioned in step 2 of the description?
3 - Was it detached first from the initial oVirt setup it was attached to?

Comment 2 Raul Laansoo 2014-11-06 14:01:27 UTC
1. Yes, I can import the existing FC storage domain when I de-select 'Use JSON protocol' in the host settings. With it selected, I get the error.
2. It basically means that I take a live snapshot of all LUNs on the NetApp and then attach those LUNs to some other FC data center.
3. No, it's a recovery scenario; there was no orderly detach from the initial oVirt data center.

Comment 3 Allon Mureinik 2014-11-06 15:56:14 UTC
Maor, can you add VDSM's log too please?

Comment 4 Piotr Kliczewski 2014-11-07 11:28:32 UTC
Please provide vdsm logs and specify the vdsm version that you used for testing.

Comment 5 Raul Laansoo 2014-11-07 11:44:34 UTC
Created attachment 954905 [details]
vdsm.log

Attached logs.

vdsm-4.16.0-3.git601f786.el6

Comment 6 Piotr Kliczewski 2014-11-07 13:19:48 UTC
It seems that you are using quite an old version of vdsm. Can you please update and retest?

Comment 7 Raul Laansoo 2014-11-07 13:59:11 UTC
Created attachment 954930 [details]
dmesg.log

Yes, I know. This is because of another problem I experience. Every time I configure a node with the latest vdsm packages (either by installing them on bare CentOS or by using the oVirt Node image), after adding the node to the engine all my storage devices go offline: all multipath devices fail or disks get corrupted after a vdsm service restart. I have attached the dmesg output from one such case.

Comment 8 Piotr Kliczewski 2014-11-07 14:19:13 UTC
I compared the code with the logs that you provided and I can see that patches from the beginning of August are missing, so your vdsm is older. In the meantime we hit the issue that you found, but it has since been fixed.

I think that once we fix your issue with the latest vdsm, the problem that you reported in this bug will go away.

Allon, can you help solve the multipath issue?

Comment 9 Raul Laansoo 2014-11-07 15:52:42 UTC
Created attachment 954997 [details]
node logs

I have attached all the logs from the failed node.

Comment 10 Allon Mureinik 2014-11-10 15:32:19 UTC
(In reply to Raul Laansoo from comment #7)
> Created attachment 954930 [details]
> dmesg.log
> 
> Yes, I know. This is because another problem I experience. Every time I
> configure node with latest vdsm packages (either install them on bare CentOS
> or use oVirt node image), after adding node to the engine, all my storage
> devices go offline -- all multipath devices fail or disks got corrupted
> after vdsm service restart. I have attached the output of dmesg of one such
> case.

(In reply to Piotr Kliczewski from comment #8)
> Allon can you help to solve multipath issue?

Maor, please take this one.

Comment 11 Maor 2014-11-10 15:51:07 UTC
(In reply to Allon Mureinik from comment #10)
> (In reply to Raul Laansoo from comment #7)
> > Created attachment 954930 [details]
> > dmesg.log
> > 
> > Yes, I know. This is because another problem I experience. Every time I
> > configure node with latest vdsm packages (either install them on bare CentOS
> > or use oVirt node image), after adding node to the engine, all my storage
> > devices go offline -- all multipath devices fail or disks got corrupted
> > after vdsm service restart. I have attached the output of dmesg of one such
> > case.
> 
> (In reply to Piotr Kliczewski from comment #8)
> > Allon can you help to solve multipath issue?
> 
> Maor, please take this one.

I prefer to open a new bug for that; this bug is only relevant for the JSON-RPC scenario.
Raul, can you please open a new bug describing the steps to reproduce?

Comment 13 Barak 2014-11-12 16:40:26 UTC
Changed the whiteboard to storage - the infra (json-rpc) related issues were already fixed.

Comment 14 Raul Laansoo 2014-11-18 07:00:28 UTC
I have now fixed the problem with multipath, and attaching the storage domain works. I see only one error:

2014-11-18 08:51:25,711 ERROR [org.ovirt.engine.core.bll.GetUnregisteredDisksQuery] (ajp--127.0.0.1-8702-1) [48368e54] Could not get populated disk, reason: null

Is it at all possible to recover VMs from a v3.4 storage domain imported into a 3.5 data center, or is this only possible with v3.5 storage domains?

Comment 15 Maor 2014-11-18 07:31:28 UTC
(In reply to Raul Laansoo from comment #14)
> I have now fixed the problem with multipath and attaching storage domain
> works. I see only one error:
> 
> 2014-11-18 08:51:25,711 ERROR
> [org.ovirt.engine.core.bll.GetUnregisteredDisksQuery]
> (ajp--127.0.0.1-8702-1) [48368e54] Could not get popula
> ted disk, reason: null

Great!
This is a redundant warning; you can ignore it, as it should not influence the process.
There is already an open bug on this: https://bugzilla.redhat.com/1136808
and also a patch for fixing it: http://gerrit.ovirt.org/#/c/33154/

> 
> Is it at all possible to recover VMs from v3.4 storage domain imported into
> 3.5 datacenter or is this only possible with v3.5 storage domains?

You can only recover Storage Domains which contain an OVF_STORE disk; currently this is only supported for 3.5 Data Centers.

I've changed the wiki to indicate this:
http://www.ovirt.org/Features/ImportStorageDomain#General_Functionality
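
As a rough illustration of the recovery flow described on that feature page, the sketch below lists the unregistered VMs on the imported domain and registers one of them into a cluster of the new Data Center through the REST API, using python-requests. The ';unregistered' matrix parameter and the '/register' action are taken from the feature description and may differ in your engine version, and the URL, credentials, UUIDs and cluster name are placeholders; verify the exact paths against your engine's API.

import requests

ENGINE = 'https://engine.example.com/api'
AUTH = ('admin@internal', 'password')           # placeholder credentials
SD_ID = '46243ce5-face-483e-9a40-7daea77d82a3'  # the imported storage domain
HEADERS = {'Content-Type': 'application/xml'}

# List VMs that exist on the domain but are not yet registered in the engine
# (assumed endpoint, per the Import Storage Domain feature page).
resp = requests.get('%s/storagedomains/%s/vms;unregistered' % (ENGINE, SD_ID),
                    auth=AUTH, verify=False)
print(resp.text)

# Register one of them into a cluster of the new Data Center
# (VM_ID and cluster name are placeholders).
VM_ID = '00000000-0000-0000-0000-000000000000'
body = '<action><cluster><name>Default</name></cluster></action>'
resp = requests.post('%s/storagedomains/%s/vms/%s/register' % (ENGINE, SD_ID, VM_ID),
                     data=body, headers=HEADERS, auth=AUTH, verify=False)
print(resp.status_code)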

Comment 16 Maor 2014-11-18 07:33:40 UTC
Since it seems that attaching the storage domain works with JSON-RPC and the issue does not reproduce, I'm closing the bug.

Please reopen if you see it reproduce again.

