Bug 1267807

Summary: [RFE] HostedEngine - support for multiple iscsi targets
Product: Red Hat Enterprise Virtualization Manager
Reporter: Marcus West <mwest>
Component: ovirt-hosted-engine-setup
Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED ERRATA
QA Contact: Nikolai Sednev <nsednev>
Severity: medium
Priority: medium
Docs Contact:
Version: 3.5.4
CC: bburmest, ecohen, lsurette, mavital, michal.skrivanek, sbonazzo, ylavi
Target Milestone: ovirt-4.2.0
Target Release: 4.2.0
Keywords: FutureFeature, Reopened, Triaged
Flags: mavital: testing_plan_complete+
Hardware: All
OS: Linux
Whiteboard: integration
Fixed In Version:
Doc Type: Enhancement
Doc Text: All portals of an iSCSI portal group are connected to enable iSCSI multipath, saving IP and port values in a string.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-15 17:32:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1149579, 1512534, 1536055
Bug Blocks: 1458709

Description Marcus West 2015-10-01 05:59:39 UTC
## Description of problem:

The customer would like to use HostedEngine on iSCSI with multiple (IP) interfaces to the target. Currently, it is only possible to configure one; a sketch of the desired multi-portal behavior follows.
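
For illustration only, a minimal sketch of what multi-portal support amounts to at the iSCSI level: discover the target once, then log in through every advertised portal instead of a single one, so that dm-multipath can aggregate the resulting devices. This assumes iscsiadm from open-iscsi; the IQN and portal addresses are taken from the verification transcript in comment 5, not from any actual setup code.

    # Hedged sketch: connect all portals of one iSCSI target.
    import subprocess

    DISCOVERY_ADDRESS = "10.35.72.52:3260"  # example discovery portal
    TARGET_IQN = "iqn.2005-10.org.freenas.ctl:4he3iwscsitarget"  # example IQN
    PORTALS = ["10.35.72.52:3260", "10.35.72.53:3260"]  # example portals

    # sendtargets discovery creates node records for every advertised portal.
    subprocess.run(
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets",
         "-p", DISCOVERY_ADDRESS],
        check=True,
    )

    # Log in through each portal; every login opens one iSCSI session, and
    # dm-multipath merges the per-session block devices into one map.
    for portal in PORTALS:
        subprocess.run(
            ["iscsiadm", "-m", "node", "-T", TARGET_IQN,
             "-p", portal, "--login"],
            check=True,
        )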

## Version-Release number of selected component (if applicable):

rhevm-3.5.4.2-1.3.el6ev.noarch

## How reproducible:

Always

Comment 1 Sandro Bonazzola 2015-10-05 12:18:34 UTC
Thanks for reporting; this issue seems to be already covered by bug #1193961.
Closing as a duplicate.

*** This bug has been marked as a duplicate of bug 1193961 ***

Comment 2 Simone Tiraboschi 2015-12-03 14:43:48 UTC

*** This bug has been marked as a duplicate of bug 1149579 ***

Comment 5 Nikolai Sednev 2017-12-25 12:47:01 UTC
I've deployed SHE on two ha-hosts over iSCSI, with a single NFS data storage domain, using one portal with two NICs on a FreeNAS target. Deployment was successful; once at least one IP of the target was provided, the additional path (10.35.72.53:3260) was discovered automatically during deployment, as shown below:

[ INFO  ] Stage: Environment customization
         
          --== STORAGE CONFIGURATION ==--
         
          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]: iscsi
          Please specify the iSCSI portal IP address: 10.35.72.52
          Please specify the iSCSI portal port [3260]: 
          Please specify the iSCSI portal user: 
          The following targets have been found:
                [1]     iqn.2005-10.org.freenas.ctl:4he3iwscsitarget
                        TPGT: 1, portals:
                                10.35.72.52:3260
                                10.35.72.53:3260
         
          Please select a target (1) [1]: 
[ INFO  ] Connecting to the storage server
          The following luns have been found on the requested target:
                [1]     36589cfc0000000c55aa7e1416b12a8fc       88GiB   FreeNAS iSCSI Disk
                        status: free, paths: 2 active
         
          Please select the destination LUN (1) [1]: 
[ INFO  ] Connecting to the storage server
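
For reference, the auto-discovery above follows from standard sendtargets semantics: one discovery request returns every portal in the target portal group. A minimal sketch of enumerating them, assuming the usual iscsiadm sendtargets output format ("ip:port,tpgt iqn"); the discovery address is the one from the transcript above:

    # Hedged sketch: list all portals per (target, TPGT) from one address.
    import subprocess
    from collections import defaultdict

    out = subprocess.run(
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets",
         "-p", "10.35.72.52:3260"],
        check=True, capture_output=True, text=True,
    ).stdout

    # Each output line looks like:
    #   10.35.72.52:3260,1 iqn.2005-10.org.freenas.ctl:4he3iwscsitarget
    portals = defaultdict(list)
    for line in out.splitlines():
        endpoint, _, iqn = line.partition(" ")
        address, _, tpgt = endpoint.partition(",")
        portals[(iqn, tpgt)].append(address)

    for (iqn, tpgt), addrs in sorted(portals.items()):
        print(f"{iqn} (TPGT {tpgt}): {', '.join(addrs)}")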

Moving to verified.

Tested on these components:
rhvm-appliance-4.2-20171207.0.el7.noarch
sanlock-3.5.0-1.el7.x86_64
ovirt-host-deploy-1.7.0-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.2-1.el7ev.noarch
vdsm-4.20.9.3-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.2.2-1.el7ev.noarch
mom-0.5.11-1.el7ev.noarch
libvirt-client-3.2.0-14.el7_4.5.x86_64
ovirt-host-4.2.0-1.el7ev.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.13.x86_64
Linux version 3.10.0-693.15.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Dec 14 05:13:32 EST 2017
Linux 3.10.0-693.15.1.el7.x86_64 #1 SMP Thu Dec 14 05:13:32 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)


After deployment, I also verified proper HA behavior, and it worked as expected.

puma18 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma18
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "Up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 9656e9f2
local_conf_timestamp               : 8092
Host timestamp                     : 8090
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=8090 (Mon Dec 25 14:16:57 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=8092 (Mon Dec 25 14:16:59 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False


--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : puma19
Host ID                            : 2
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 7151579e
local_conf_timestamp               : 7994
Host timestamp                     : 7992
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7992 (Mon Dec 25 14:16:46 2017)
        host-id=2
        score=3400
        vm_conf_refresh_time=7994 (Mon Dec 25 14:16:49 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False


Both paths are up and active by default on both ha-hosts:
puma18 ~]# iscsiadm -m session
tcp: [1] 10.35.72.52:3260,1 iqn.2005-10.org.freenas.ctl:4he3iwscsitarget (non-flash)
tcp: [2] 10.35.72.53:3260,1 iqn.2005-10.org.freenas.ctl:4he3iwscsitarget (non-flash)

puma18 ~]#  multipath -ll
36589cfc0000000c55aa7e1416b12a8fc dm-0 FreeNAS ,iSCSI Disk      
size=88G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:0:0 sdc 8:32 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 6:0:0:0 sdb 8:16 active ready running

puma19 ~]# iscsiadm -m session
tcp: [1] 10.35.72.52:3260,1 iqn.2005-10.org.freenas.ctl:4he3iwscsitarget (non-flash)
tcp: [2] 10.35.72.53:3260,1 iqn.2005-10.org.freenas.ctl:4he3iwscsitarget (non-flash)

puma19 ~]#  multipath -ll
36589cfc0000000c55aa7e1416b12a8fc dm-0 FreeNAS ,iSCSI Disk      
size=88G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 6:0:0:0 sdb 8:16 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:0:0 sdc 8:32 active ready running


I then tested HA on puma18 (the ha-host on which the SHE-VM was first deployed and was running): I blocked the active path 10.35.72.53:3260 and observed the results. Path failover worked correctly: the second path 10.35.72.52:3260 became active and the engine remained up and running on puma18 (see the sketch below, followed by the resulting multipath and status output):
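
One common way to simulate such a path failure is to drop traffic to the portal with iptables; a minimal sketch under that assumption (the exact blocking method used in this test is not recorded in this report):

    # Hedged sketch: fail one iSCSI path by dropping traffic to its portal.
    import subprocess

    BLOCKED_IP = "10.35.72.53"  # the active path being failed

    # Drop outgoing iSCSI (TCP/3260) traffic to the portal; multipath marks
    # the corresponding path "failed faulty" and shifts I/O to the other one.
    subprocess.run(
        ["iptables", "-I", "OUTPUT", "-d", BLOCKED_IP,
         "-p", "tcp", "--dport", "3260", "-j", "DROP"],
        check=True,
    )
    # To restore the path afterwards, delete the same rule with -D.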
puma18 ~]#  multipath -ll
36589cfc0000000c55aa7e1416b12a8fc dm-0 FreeNAS ,iSCSI Disk      
size=88G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 7:0:0:0 sdc 8:32 failed faulty running
`-+- policy='service-time 0' prio=1 status=active
  `- 6:0:0:0 sdb 8:16 active ready running
puma18 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma18
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "Up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 45053952
local_conf_timestamp               : 9507
Host timestamp                     : 9504
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9504 (Mon Dec 25 14:40:30 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=9507 (Mon Dec 25 14:40:33 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False


--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma19
Host ID                            : 2
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : acf0b444
local_conf_timestamp               : 9417
Host timestamp                     : 9414
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9414 (Mon Dec 25 14:40:29 2017)
        host-id=2
        score=3400
        vm_conf_refresh_time=9417 (Mon Dec 25 14:40:32 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False


Next, I blocked the second path 10.35.72.52:3260 in addition to the first path 10.35.72.53:3260; with both paths down on puma18, the SHE-VM was migrated to puma19:
puma18 ~]#  multipath -ll
36589cfc0000000c55aa7e1416b12a8fc dm-0 FreeNAS ,iSCSI Disk      
size=88G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 7:0:0:0 sdc 8:32 failed faulty running
`-+- policy='service-time 0' prio=0 status=enabled
  `- 6:0:0:0 sdb 8:16 failed faulty running
puma19 ~]#  multipath -ll
36589cfc0000000c55aa7e1416b12a8fc dm-0 FreeNAS ,iSCSI Disk      
size=88G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 6:0:0:0 sdb 8:16 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:0:0 sdc 8:32 active ready running
puma19 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : puma18
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 3bf71c33
local_conf_timestamp               : 9643
Host timestamp                     : 9640
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9640 (Mon Dec 25 14:42:47 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=9643 (Mon Dec 25 14:42:50 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False


--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma19
Host ID                            : 2
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "up", "detail": "WaitForLaunch"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 30e86f03
local_conf_timestamp               : 9660
Host timestamp                     : 9657
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9657 (Mon Dec 25 14:44:31 2017)
        host-id=2
        score=3400
        vm_conf_refresh_time=9660 (Mon Dec 25 14:44:34 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineStarting
        stopped=False

puma19 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : puma18
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 3bf71c33
local_conf_timestamp               : 9643
Host timestamp                     : 9640
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9640 (Mon Dec 25 14:42:47 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=9643 (Mon Dec 25 14:42:50 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False


--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma19
Host ID                            : 2
Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : a383c8b0
local_conf_timestamp               : 9717
Host timestamp                     : 9714
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9714 (Mon Dec 25 14:45:28 2017)
        host-id=2
        score=3400
        vm_conf_refresh_time=9717 (Mon Dec 25 14:45:32 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineStarting
        stopped=False

puma19 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : puma18
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 3bf71c33
local_conf_timestamp               : 9643
Host timestamp                     : 9640
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9640 (Mon Dec 25 14:42:47 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=9643 (Mon Dec 25 14:42:50 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False


--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma19
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "Up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 02524a7e
local_conf_timestamp               : 9764
Host timestamp                     : 9760
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9760 (Mon Dec 25 14:46:15 2017)
        host-id=2
        score=3400
        vm_conf_refresh_time=9764 (Mon Dec 25 14:46:18 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False

Comment 8 errata-xmlrpc 2018-05-15 17:32:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1471

Comment 9 Franta Kust 2019-05-16 13:04:37 UTC
BZ<2>Jira Resync