Bug 1261532

Summary: overpotent DELETE on storage domain's iSCSI storage connection removes all connections and all the <logical_unit> data
Product: Red Hat Enterprise Virtualization Manager Reporter: David Jaša <djasa>
Component: ovirt-engine Assignee: Maor <mlipchuk>
Status: CLOSED NOTABUG QA Contact: Aharon Canan <acanan>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.4.0 CC: ahino, amureini, bazulay, djasa, ecohen, gklein, lsurette, mlipchuk, rbalakri, Rhev-m-bugs, tnisan, yeylon
Target Milestone: ---   
Target Release: 3.5.6   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-09-24 16:20:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
API traffic none

Description David Jaša 2015-09-09 14:28:02 UTC
Created attachment 1071803 [details]
API traffic

Description of problem:
When issuing DELETE on /ovirt-engine/api/storagedomains/SD_UUID/storageconnections/SC_UUID, one expects that only the SC_UUID connection will be removed from the storage domain and that all the others will remain in place. In fact, all the storage connections get removed. In addition, all the <logical_unit> elements (from <storage_domain><storage><volume_group>) that contain the iSCSI configuration are removed.
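
For illustration, the request in question can be sketched in Python as follows. This only builds the request object and does not send it. Authentication (session cookie or credentials) is omitted, and the host name and UUIDs are placeholders taken from the API output that appears elsewhere in this report, so substitute your own values.

```python
import urllib.request

# Placeholder values; replace with the engine host, storage domain UUID
# and storage connection UUID of your own environment.
ENGINE = "https://rhevm34.example.com"
SD_UUID = "bbeffae4-d72f-4c66-bbff-49b595265afc"
SC_UUID = "cd6894e8-112c-4458-901d-f52a3acfc15d"


def build_detach_request(engine, sd_uuid, sc_uuid):
    """Build (but do not send) the DELETE request on one storage connection
    of one storage domain."""
    url = (
        f"{engine}/ovirt-engine/api/storagedomains/{sd_uuid}"
        f"/storageconnections/{sc_uuid}"
    )
    return urllib.request.Request(url, method="DELETE")


req = build_detach_request(ENGINE, SD_UUID, SC_UUID)
print(req.get_method(), req.full_url)
```

Note that the URL addresses a single connection (SC_UUID) under a single domain (SD_UUID), which is why one would expect only that connection to be affected.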

After re-adding the connection that should have stayed intact, the storage domain works, but because the <logical_unit> data is missing, the Edit Domain dialog doesn't show the LUN as belonging to the storage domain (green tick). Instead, the LUN has a free checkbox. When you try to check it (in order to reconcile the RHEV DB with the actual data), the engine seems to interpret this as an attempt to grow the storage domain.


Version-Release number of selected component (if applicable):
RHEV 3.4 through 3.6 all behave the same

How reproducible:
always

Steps to Reproduce:
0. start with:
    * iscsi storage domain hosted by server with two IPs
    * the storage domain should be in maintenance 
    * at least one host should be up
1. check the storage domain and its connections using API (result in attachment)
2. remove one of the connections
3. check the SD and its connections again

Actual results:
  * all the SD's connections are removed, not just the one specified
  * other configuration of the SD (the <logical_unit> data) is removed as well

Expected results:
only the specified connection is affected
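
A minimal sketch of how the before/after comparison from steps 1 and 3 could be automated, assuming the `<storage_connections>` listing format returned by the storageconnections sub-collection. The function names and the short ids in the samples are hypothetical, and modelling the reported result as an empty listing is an assumption about the response shape.

```python
import xml.etree.ElementTree as ET


def connection_ids(xml_text):
    """Return the set of ids of all <storage_connection> elements."""
    root = ET.fromstring(xml_text)
    return {sc.get("id") for sc in root.iter("storage_connection")}


def unexpectedly_removed(before_xml, after_xml, removed_id):
    """Return ids that vanished although they were not the DELETE target."""
    gone = connection_ids(before_xml) - connection_ids(after_xml)
    return gone - {removed_id}


# Hypothetical, shortened sample listings (real ids are UUIDs).
BEFORE = """<storage_connections>
    <storage_connection id="sc-1"><address>IP1</address></storage_connection>
    <storage_connection id="sc-2"><address>IP2</address></storage_connection>
</storage_connections>"""

# Reported behaviour, modelled as an empty listing (assumption).
AFTER_ACTUAL = "<storage_connections/>"

# Expected behaviour: only sc-1 is gone.
AFTER_EXPECTED = """<storage_connections>
    <storage_connection id="sc-2"><address>IP2</address></storage_connection>
</storage_connections>"""

print(unexpectedly_removed(BEFORE, AFTER_ACTUAL, "sc-1"))
print(unexpectedly_removed(BEFORE, AFTER_EXPECTED, "sc-1"))
```

A non-empty result means that connections other than the one targeted by the DELETE disappeared.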

Additional info:
Once this bug is fixed, removing a storage connection from an SD could be allowed while the domain is in the Up state, provided that at least one connection remains active.
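
The condition suggested here could look roughly like the following. This is only a sketch of the proposed rule, not engine code. The function name is hypothetical, and it assumes that every id passed in belongs to a currently active connection.

```python
def may_remove_connection(active_ids, target_id):
    """Allow removal only if the target is among the active connections
    and at least one other active connection would remain afterwards."""
    active = set(active_ids)
    remaining = active - {target_id}
    return target_id in active and len(remaining) >= 1


# Two connections: removing one leaves the other in place.
print(may_remove_connection({"sc-1", "sc-2"}, "sc-1"))
# A single connection: removing it would leave the domain without any.
print(may_remove_connection({"sc-1"}, "sc-1"))
```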

Comment 2 Allon Mureinik 2015-09-09 16:19:56 UTC
Can we get the engine's log too please?

Comment 4 Allon Mureinik 2015-09-13 10:42:47 UTC
Ala, can you please take a look at this?

Comment 5 Allon Mureinik 2015-09-13 12:14:12 UTC
Too many eggs in the same basket.
Maor - please take this over.

Comment 6 Allon Mureinik 2015-09-22 06:42:59 UTC
Maor, I see you backported a patch from master, and then abandoned it. Can you please share some details on it and explain the current status?

Comment 7 Maor 2015-09-22 08:01:40 UTC
I've been trying to reproduce this bug on my env, with no success.
I've added a new Storage Domain and deleted one of its targets through REST - All seems to work as expected.

At first I thought the reason it did not reproduce was that the patch had not been backported, but then I also tried to reproduce it on a 3.5 env, and it did not reproduce there either.

I'm trying to investigate this some more. It might be related to the fact that VDSM only preserves the VG metadata on the first LUN, so currently I'm trying to prove or disprove this theory.

Comment 8 Maor 2015-09-22 08:59:30 UTC
David, it doesn't seem to reproduce on my env (3.5 and 3.6).
Also, the engine logs do not indicate any error.
What I do see is that in attachment 1071803 [details] (API traffic), the last request is:

https://rhevm34.example.com/ovirt-engine/api/storagedomains/bbeffae4-d72f-4c66-bbff-49b595265afc

This request is without storageconnections, so maybe this was the reason you didn't see any storage connections at all.

Can you please try to use the following request instead:

https://rhevm34.example.com/ovirt-engine/api/storagedomains/bbeffae4-d72f-4c66-bbff-49b595265afc/storageconnections 

and share the output

Comment 9 Maor 2015-09-22 09:10:52 UTC
(In reply to Maor from comment #8)

I looked at it again, and I can see that in the API traffic you did call the mentioned request, but I still can't reproduce this.
Can you please elaborate on the reproduction steps and try to reproduce this on a new env?

Comment 10 David Jaša 2015-09-22 09:44:59 UTC
Storage connections are defined (as of now, after another change):


$ curl -b .cookies/rhevm34-admin -H 'prefer: persistent-auth' https://rhevm34.example.com/ovirt-engine/api/storagedomains/bbeffae4-d72f-4c66-bbff-49b595265afc
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<storage_domain href="/ovirt-engine/api/storagedomains/bbeffae4-d72f-4c66-bbff-49b595265afc" id="bbeffae4-d72f-4c66-bbff-49b595265afc">
    <name>rhevm34</name>
    <link href="/ovirt-engine/api/storagedomains/bbeffae4-d72f-4c66-bbff-49b595265afc/permissions" rel="permissions"/>
    <link href="/ovirt-engine/api/storagedomains/bbeffae4-d72f-4c66-bbff-49b595265afc/disks" rel="disks"/>
    <link href="/ovirt-engine/api/storagedomains/bbeffae4-d72f-4c66-bbff-49b595265afc/storageconnections" rel="storageconnections"/>
    <type>data</type>
    <master>true</master>
    <storage>
        <type>iscsi</type>
        <volume_group id="tZdnJP-02XN-HpZ9-frwo-y11w-ACfd-UBeySe"/>
    </storage>
    <available>150323855360</available>
    <used>768799145984</used>
    <committed>1902670512128</committed>
    <storage_format>v3</storage_format>
</storage_domain>


$ curl -b .cookies/rhevm34-admin -H 'prefer: persistent-auth' https://rhevm34.example.com/ovirt-engine/api/storagedomains/bbeffae4-d72f-4c66-bbff-49b595265afc/storageconnections
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<storage_connections>
    <storage_connection href="/ovirt-engine/api/storageconnections/cd6894e8-112c-4458-901d-f52a3acfc15d" id="cd6894e8-112c-4458-901d-f52a3acfc15d">
        <address>new_IP1</address>
        <type>iscsi</type>
        <port>3260</port>
        <target>iqn.1986-03.com.hp:storage.p2000g3.1235158e24</target>
    </storage_connection>
    <storage_connection href="/ovirt-engine/api/storageconnections/061fe262-6be4-45a6-bfec-0fc58d9f11cd" id="061fe262-6be4-45a6-bfec-0fc58d9f11cd">
        <address>new_IP2</address>
        <type>iscsi</type>
        <port>3260</port>
        <target>iqn.1986-03.com.hp:storage.p2000g3.1235158e24</target>
    </storage_connection>
</storage_connections>


There are also interesting side effects: the data domain works, VMs can be started, etc., but:
  * the ISO domain can't be activated. The engine Events say that no host can reach
    the domain, which isn't true
  * in the New Domain/Edit Domain dialogs, the iSCSI LUN occupied by the domain is
    marked as free, and an attempt to check it and save results in RHEV trying to
    "extend" the VG to the LUN, which is actually already there
  * the same occurs when trying to update the <storage> element of the domain in the API
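
The missing LUN data can be checked mechanically with a small sketch like the one below, assuming the `<logical_unit>` elements sit under `<storage_domain><storage><volume_group>` as described in this report. The function name, the `id` attribute on `<logical_unit>` and the LUN id in the "intact" sample are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def logical_unit_ids(storage_domain_xml):
    """Return the ids of <logical_unit> elements under storage/volume_group."""
    root = ET.fromstring(storage_domain_xml)
    units = root.findall("./storage/volume_group/logical_unit")
    return [lu.get("id") for lu in units]


# Trimmed from the storage domain output above: the volume group is empty.
STRIPPED = """<storage_domain id="bbeffae4-d72f-4c66-bbff-49b595265afc">
    <storage>
        <type>iscsi</type>
        <volume_group id="tZdnJP-02XN-HpZ9-frwo-y11w-ACfd-UBeySe"/>
    </storage>
</storage_domain>"""

# Hypothetical intact variant; the LUN id is made up for illustration.
INTACT = """<storage_domain id="bbeffae4-d72f-4c66-bbff-49b595265afc">
    <storage>
        <type>iscsi</type>
        <volume_group id="tZdnJP-02XN-HpZ9-frwo-y11w-ACfd-UBeySe">
            <logical_unit id="hypothetical-lun-id"/>
        </volume_group>
    </storage>
</storage_domain>"""

print(logical_unit_ids(STRIPPED))
print(logical_unit_ids(INTACT))
```

An empty list for a domain that is known to sit on an iSCSI LUN matches the symptom described in this report.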

Comment 12 Maor 2015-09-22 10:12:56 UTC
(In reply to David Jaša from comment #10)
> Storage connections are defined (as of now, after another change):

What does that mean? Did you manage to reproduce it?

> There are also interesting side effects: the data domain works, VMs can be started, etc., but:
>   * the ISO domain can't be activated. The engine Events say that no host can reach
>     the domain, which isn't true

Can you troubleshoot this problem? Did the host mount the ISO domain?
Can you please open a separate bug on that?

>   * in the New Domain/Edit Domain dialogs, the iSCSI LUN occupied by the domain is
>     marked as free, and an attempt to check it and save results in RHEV trying to
>     "extend" the VG to the LUN, which is actually already there

Weird, on my env it works as expected. We need to check in the engine logs what GetDeviceListVDSCommand returns. Can you please open a separate bug on that issue with the engine log attached to it?

>   * the same occurs when trying to update <storage> element of the domain in API

Comment 16 Maor 2015-09-24 13:52:15 UTC
David,

Please share your comments if something was not clear,
or if you think I misunderstood anything.

Regards,
Maor

Comment 17 David Jaša 2015-09-24 15:02:08 UTC
Makes sense, thank you.

Comment 18 Maor 2015-09-24 16:20:26 UTC
Thanks for the quick response, David.
I'm closing this bug for now; please let me know if there are any objections.