Created attachment 1411665 [details]
sosreport from alma03

Description of problem:
Deployment on a 54GB LUN fails with an unreadable error.

[ INFO ] TASK [Get iSCSI LUNs]
[ INFO ] ok: [localhost]
          The following luns have been found on the requested target:
          [1] 3514f0c5a5160167e 54GiB XtremIO XtremApp
              status: free, paths: 1 active
          Please select the destination LUN (1) [1]:
[ INFO ] iSCSI discard after delete is enabled
[ INFO ] Creating Storage Domain
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Obtain SSO token using username/password credentials]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch host facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch cluster id]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch cluster facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch datacenter facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch datacenter id]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch datacenter_name]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Add nfs storage domain]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [Add glusterfs storage domain]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [Add iSCSI storage domain]
[ ERROR ] Error: Fault reason is "Operation Failed". Fault detail is "[]". HTTP response code is 400.
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[]\". HTTP response code is 400."}
          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs)[nfs]:

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-2.2.7-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.13-1.el7ev.noarch
rhvm-appliance-4.2-20180202.0.el7.noarch
Linux 3.10.0-861.el7.x86_64 #1 SMP Wed Mar 14 10:21:01 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

How reproducible:
100%

Steps to Reproduce:
1. Deploy SHE over a 54GB LUN on iSCSI.

Actual results:
[ ERROR ] Error: Fault reason is "Operation Failed". Fault detail is "[]". HTTP response code is 400.
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[]\". HTTP response code is 400."}

Expected results:
msg: "Error: the target storage domain contains only {{ storage_domain_details.ansible_facts.ovirt_storage_domains[0].available|int / 1024 /1024 }}Mb of available space while a minimum of {{ required_size|int / 1024 /1024 }}Mb is required"

Additional info:
See also https://bugzilla.redhat.com/show_bug.cgi?id=1522737
Sosreport from host is attached.
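Editor's note: the reported LUN size can be cross-checked directly from the host; a minimal sketch, assuming the iSCSI session is (re)established manually and using the portal, IQN and WWID already listed in this report:

# Discover and log in to the target that exposes the candidate LUN.
iscsiadm -m discovery -t sendtargets -p 10.35.146.129:3260
iscsiadm -m node -T iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00 -p 10.35.146.129:3260 -l

# Check the size of the multipath map for the candidate LUN (WWID from the listing above).
multipath -ll 3514f0c5a5160167e

# Or inspect the raw block devices and their sizes.
lsblk -o NAME,SIZE,VENDOR,MODEL,WWN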
I think it's just a bad configuration on SAN side.

[root@alma03 ~]# iscsiadm -m session
iscsiadm: No active sessions.
[root@alma03 ~]# iscsiadm -m discovery -t sendtargets -p 10.35.146.129:3260
10.35.146.129:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00
10.35.146.161:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01
10.35.146.193:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04
10.35.146.225:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05
[root@alma03 ~]# iscsiadm -m node -T iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 -l -p 10.35.146.129:3260
iscsiadm: No records found

So we have a target portal group (with tag 1) with four portals (10.35.146.129:3260, 10.35.146.161:3260, 10.35.146.193:3260, 10.35.146.225:3260), but each of them exposes a different target IQN, while I'd expect the same IQN on the whole target portal group.
So when we try to access iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 on 10.35.146.129:3260 it fails, since iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 is exposed just by 10.35.146.225:3260, while 10.35.146.129:3260 exposes just iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00.
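Editor's note: the point above is that login only succeeds when an IQN is paired with the portal that actually exposes it; a sketch using the addresses from this host (my reading of the discovery output, not verified on this array):

# Discovery against any portal returns all four IQN/portal pairs.
iscsiadm -m discovery -t sendtargets -p 10.35.146.129:3260

# Login works when the IQN is paired with its own portal ...
iscsiadm -m node -T iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 -p 10.35.146.225:3260 -l

# ... while pairing that IQN with another portal of the group has no node record,
# as shown above:
#   iscsiadm -m node -T iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 -l -p 10.35.146.129:3260
#   iscsiadm: No records found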
(In reply to Simone Tiraboschi from comment #1)
> I think it's just a bad configuration on SAN side.
>
> [root@alma03 ~]# iscsiadm -m session
> iscsiadm: No active sessions.
> [root@alma03 ~]# iscsiadm -m discovery -t sendtargets -p 10.35.146.129:3260
> 10.35.146.129:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00
> 10.35.146.161:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01
> 10.35.146.193:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04
> 10.35.146.225:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05
> [root@alma03 ~]# iscsiadm -m node -T iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 -l -p 10.35.146.129:3260
> iscsiadm: No records found
>
> So we have a target portal group (with tag 1) with four portals
> (10.35.146.129:3260, 10.35.146.161:3260, 10.35.146.193:3260, 10.35.146.225:3260),
> but each of them exposes a different target IQN, while I'd expect the same IQN
> on the whole target portal group.
> So when we try to access iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05
> on 10.35.146.129:3260 it fails, since that IQN is exposed just by 10.35.146.225:3260,
> while 10.35.146.129:3260 exposes just iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00.

It's not a problem. We've been working with this topology for over 5 years and have never had any issues with it whatsoever.
[ INFO ] changed: [localhost]
          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs)[nfs]: iscsi
          Please specify the iSCSI portal IP address: 10.35.146.129
          Please specify the iSCSI portal port [3260]:
          Please specify the iSCSI discover user:
          Please specify the iSCSI discover password:
          Please specify the iSCSI portal login user:
          Please specify the iSCSI portal login password:
[ INFO ] Discovering iSCSI targets
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Obtain SSO token using username/password credentials]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Prepare iSCSI parameters]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch host facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [iSCSI discover with REST API]
[ INFO ] ok: [localhost]
          The following targets have been found:
          [1] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05
              TPGT: 1, portals: 10.35.146.225:3260
          [2] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04
              TPGT: 1, portals: 10.35.146.193:3260
          [3] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01
              TPGT: 1, portals: 10.35.146.161:3260
          [4] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00
              TPGT: 1, portals: 10.35.146.129:3260
          Please select a target (1, 2, 3, 4) [1]:
[ INFO ] ok: [localhost]
          The following luns have been found on the requested target:
          [1] 3514f0c5a5160167e 54GiB XtremIO XtremApp
              status: free, paths: 1 active
          Please select the destination LUN (1) [1]:

You can clearly see above that I chose one of the four targets ([1]) and that the LUN was seen through that target right after logging in to it.
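Editor's note: the login and LUN visibility claimed above can also be double-checked manually on the host; a minimal sketch, assuming the session created by the setup is still active:

# List active sessions and, for each, the portal, target IQN and attached disks.
iscsiadm -m session
iscsiadm -m session -P 3 | grep -E 'Target:|Current Portal:|Attached scsi disk'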
How is this related to the LUN size?
(In reply to Doron Fediuck from comment #4)
> How is this related to the lun size?

If you're asking why this happens on a 54GB LUN and not on a 60GB one, I really have no idea. On a regular 60 or 70GB LUN the deployment was working just fine.
Further to our findings, this is not related to the LUN's size. The issue is that the ansible flow reuses the portal IP that was used for discovery. I'm using one of the 4 available iSCSI target IPs for the initial discovery and then selecting a target on another of the 4 IPs for the actual deployment. On Node 0 this does not work properly: the discovery portal 10.35.146.129:3260 ends up being used as the deployment target portal, although I intentionally selected the target on 10.35.146.225:3260 in the CLI.

[ INFO ] changed: [localhost]
          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs)[nfs]: iscsi
          Please specify the iSCSI portal IP address: 10.35.146.129
          Please specify the iSCSI portal port [3260]:
          Please specify the iSCSI discover user:
          Please specify the iSCSI discover password:
          Please specify the iSCSI portal login user:
          Please specify the iSCSI portal login password:
[ INFO ] Discovering iSCSI targets
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Obtain SSO token using username/password credentials]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Prepare iSCSI parameters]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch host facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [iSCSI discover with REST API]
[ INFO ] ok: [localhost]
          The following targets have been found:
          [1] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05
              TPGT: 1, portals: 10.35.146.225:3260
          [2] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04
              TPGT: 1, portals: 10.35.146.193:3260
          [3] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01
              TPGT: 1, portals: 10.35.146.161:3260
          [4] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00
              TPGT: 1, portals: 10.35.146.129:3260
          Please select a target (1, 2, 3, 4) [1]:

On vintage the same flow works just fine:

Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]: iscsi
Please specify the iSCSI portal IP address: 10.35.146.129
Please specify the iSCSI portal port [3260]:
Please specify the iSCSI portal user:
The following targets have been found:
[1] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05
    TPGT: 1, portals: 10.35.146.225:3260
[2] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04
    TPGT: 1, portals: 10.35.146.193:3260
[3] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01
    TPGT: 1, portals: 10.35.146.161:3260
[4] iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00
    TPGT: 1, portals: 10.35.146.129:3260
Please select a target (1, 2, 3, 4) [1]:
[ INFO ] Connecting to the storage server
The following luns have been found on the requested target:
[1] LUN1 3514f0c5a516016d9 70GiB XtremIO XtremApp
    status: free, paths: 1 active
[2] LUN2 3514f0c5a516016da 80GiB XtremIO XtremApp
    status: free, paths: 1 active
Please select the destination LUN (1, 2) [1]: 2
.
.
.
[ INFO ] Engine-setup successfully completed

There is an inconsistency between how vintage and Node 0 deploy over iSCSI.

I'm closing this bug, as deployment works as expected with regard to the 54GB LUN size. For more information, please follow https://bugzilla.redhat.com/show_bug.cgi?id=1560655
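Editor's note: which portal/IQN pair the host actually logged in to (i.e. whether the flow reused the discovery portal instead of the selected target's portal) can be verified on the host after the run; a sketch, with the sample session line below being hypothetical output for this scenario:

# Active sessions show the portal and IQN the host is really connected to.
iscsiadm -m session
#   tcp: [1] 10.35.146.129:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00   (hypothetical)

# The node records created by the setup show which portal each IQN was bound to.
iscsiadm -m node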