Bug 1411326

Summary: Fail to deploy host using Microsoft iSCSI target server
Product: [oVirt] ovirt-hosted-engine-setup
Reporter: hlmasterchief93
Component: Plugins.Block
Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED CURRENTRELEASE
QA Contact: Nikolai Sednev <nsednev>
Severity: medium
Docs Contact:
Priority: unspecified
Version: ---
CC: amureini, bugs, hlmasterchief93, stirabos, tnisan, ylavi
Target Milestone: ovirt-4.2.2
Keywords: TestOnly, Triaged
Target Release: 2.2.16
Flags: nsednev: needinfo-, rule-engine: ovirt-4.2+
Hardware: x86_64
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-04 10:48:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
- fail log (flags: none)
- /var/log/messages (flags: none)
- sosreport from host (flags: none)
- Initiators (flags: none)

Description hlmasterchief93 2017-01-09 13:37:53 UTC
Created attachment 1238774 [details]
fail log

Description of problem:
Using the Windows Server 2016 built-in iSCSI target server, hosted-engine --deploy fails at the "Creating Storage Domain" step with the message "Cannot deactivate Logical Volume".

After switching to StarWind Virtual SAN, deployment successfully passes the "Creating Storage Pool" step, before hitting a separate problem with qxl (https://bugzilla.redhat.com/show_bug.cgi?id=1411318).

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.1.0-0.0.master.20161221071740.git46cacd3.fc24.noarch
ovirt-release41-pre-4.1.0-0.6.beta2.20161221025826.gitc487776.el7.centos.noarch
fedora 24 server
kernel 4.8.15-200.fc24.x86_64
Windows Server 2016

How reproducible:
Always

Steps to Reproduce:
1. Install Fedora 24
2. Upgrade
3. Install ovirt-release41-pre
4. Install ovirt-hosted-engine-setup
5. Install ovirt-engine-appliance-4.1-20161222.1.el7.centos.noarch.rpm
6. Install python2-dnf (https://bugzilla.redhat.com/show_bug.cgi?id=1358339)
7. Run hosted-engine --deploy using iSCSI storage backed by the Windows Server 2016 built-in iSCSI target server
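The steps above can be sketched as a shell session on a fresh Fedora 24 host (the exact ovirt-release41-pre RPM URL is an assumption; substitute the actual pre-release repo RPM used at the time):

```shell
# Reproduction sketch for the steps above (run as root on Fedora 24)
dnf upgrade -y
# Step 3: pre-release repo RPM; the URL below is an assumed location
dnf install -y http://resources.ovirt.org/pub/yum-repo/ovirt-release41-pre.rpm
# Step 4
dnf install -y ovirt-hosted-engine-setup
# Step 5: appliance RPM named in the version list above
dnf install -y ovirt-engine-appliance-4.1-20161222.1.el7.centos.noarch.rpm
# Step 6: workaround for bug 1358339
dnf install -y python2-dnf
# Step 7: choose iscsi storage and point at the Windows 2016 target
hosted-engine --deploy
```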

Actual results:
[ ERROR ] Failed to execute stage 'Misc configuration': Cannot deactivate Logical Volume: ('General Storage Exception: (\'5 [] [\\\'  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!\\\', \\\'  /dev/mapper/360003ff44dc75adc9423173a0564b08d: read failed after 0 of 4096 at 0: Input/output error\\\', \\\'  /dev/mapper/360003ff44dc75adc9423173a0564b08d: read failed after 0 of 4096 at 107374116864: Input/output error\\\', \\\'  /dev/mapper/360003ff44dc75adc9423173a0564b08d: read failed after 0 of 4096 at 107374174208: Input/output error\\\', \\\'  WARNING: Error counts reached a limit of 3. Device /dev/mapper/360003ff44dc75adc9423173a0564b08d was disabled\\\', \\\'  WARNING: Error counts reached a limit of 3. Device /dev/mapper/360003ff44dc75adc9423173a0564b08d was disabled\\\', \\\'  Volume group "f7c7935c-a42f-43c0-8816-ba1c8b2f050e" not found\\\', \\\'  Cannot process volume group f7c7935c-a42f-43c0-8816-ba1c8b2f050e\\\']\\nf7c7935c-a42f-43c0-8816-ba1c8b2f050e/[\\\'master\\\']\',)',)
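The read failures above indicate the multipath device backing the new storage domain became unreadable before vdsm could deactivate the LV. A diagnostic sketch, run as root on the host (the WWID is taken from the error text above; these are standard device-mapper/LVM inspection commands, not part of the deploy flow):

```shell
# Show path state for the failing multipath device (WWID from the error above)
multipath -ll 360003ff44dc75adc9423173a0564b08d
# Check whether the device is readable at all at offset 0
dd if=/dev/mapper/360003ff44dc75adc9423173a0564b08d of=/dev/null bs=4096 count=1
# The VG from the error should appear here if LVM metadata is still readable
vgs --readonly
# Confirm the device-mapper table still references the device
dmsetup info -c | grep 360003ff44dc75adc9423173a0564b08d
```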


Expected results:
[ INFO  ] Hosted Engine successfully deployed

Additional info:

Comment 1 Sandro Bonazzola 2017-01-09 13:42:14 UTC
Nir, can you please have a look?

Comment 2 Yaniv Kaul 2017-01-10 08:20:56 UTC
Can you attach /var/log/messages, so we'll understand what the platform issues are?

Comment 3 hlmasterchief93 2017-01-12 18:03:19 UTC
Created attachment 1240046 [details]
/var/log/messages

Comment 4 Sandro Bonazzola 2017-01-13 12:49:24 UTC
Allon, is this a blocker for 4.1? If not, please re-target.

Comment 5 Nir Soffer 2017-01-13 14:19:53 UTC
(In reply to Sandro Bonazzola from comment #4)
I don't think this is a blocker.

Tal, can you retarget this?

Comment 6 Allon Mureinik 2018-04-01 14:00:32 UTC
Simone, is this flow still present with the node-zero deployment in 4.2?

Comment 7 Simone Tiraboschi 2018-04-03 07:07:10 UTC
(In reply to Allon Mureinik from comment #6)
> Simone, is this flow still present with the node-zero deployment in 4.2?

In node-zero we are going to ask the engine, via REST API, to perform all the iSCSI-related tasks, and we are not directly dealing with iscsiadm or vdsm.

I don't know if this issue is still there but for sure the new flow is different and more standard.
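The engine-side discovery call that node-zero relies on can be sketched with curl against the `hosts/{id}/iscsidiscover` action seen later in this bug; the engine FQDN, credentials, host UUID, and portal address below are placeholders:

```shell
# Sketch of the REST API iSCSI discovery action node-zero uses
# (engine FQDN, password, and HOST_UUID are placeholders)
curl -s -k -X POST \
  -u 'admin@internal:password' \
  -H 'Content-Type: application/xml' \
  -d '<action><iscsi><address>10.35.70.48</address><port>3260</port></iscsi></action>' \
  'https://engine.example.com/ovirt-engine/api/hosts/HOST_UUID/iscsidiscover'
```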

Comment 8 Allon Mureinik 2018-04-03 12:39:03 UTC
(In reply to Simone Tiraboschi from comment #7)
> (In reply to Allon Mureinik from comment #6)
> > Simone, is this flow still present with the node-zero deployment in 4.2?
> 
> In node-zero we are going to ask the engine, via REST API, to perform all
> the iSCSI-related tasks, and we are not directly dealing with iscsiadm or
> vdsm.
> 
> I don't know if this issue is still there but for sure the new flow is
> different and more standard.

So if VDSM supports this iSCSI server, I'd assume so would HE.
Moving to QE to verify.

Comment 9 Sandro Bonazzola 2018-04-05 11:48:48 UTC
Can you please re-test within your environment with the latest oVirt release (4.2.2)?

Comment 14 Nikolai Sednev 2018-04-26 14:37:39 UTC
Deployment of Node 0 fails over an iSCSI target installed on Windows Server 2016:
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "connection": "close", "content": "{\n  \"detail\" : \"Failed discovery of iSCSI targets\",\n  \"reason\" : \"Operation Failed\"\n}", "content_encoding": "identity", "content_type": "application/json", "correlation_id": "506e428c-5a5e-4327-a68b-0105ca49280f", "date": "Thu, 26 Apr 2018 14:23:50 GMT", "json": {"detail": "Failed discovery of iSCSI targets", "reason": "Operation Failed"}, "msg": "Status code was 400 and not [200]: HTTP Error 400: Bad Request", "redirected": false, "server": "Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips", "status": 400, "transfer_encoding": "chunked", "url": "https://nsednev-he-4.scl.lab.tlv.redhat.com/ovirt-engine/api/hosts/cf4ce7c4-1c0e-445c-a32b-6485e728c4d4/iscsidiscover"}

Comment 15 Nikolai Sednev 2018-04-26 14:38:31 UTC
[ ERROR ] Unable to get target list

Comment 16 Nikolai Sednev 2018-04-26 14:45:01 UTC
Created attachment 1427254 [details]
sosreport from host

Comment 17 Simone Tiraboschi 2018-04-26 14:50:56 UTC
2018-04-26 17:23:50,229+03 ERROR [org.ovirt.engine.core.bll.storage.connection.DiscoverSendTargetsQuery] (default task-30) [506e428c-5a5e-4327-a68b-0105ca49280f] Query 'DiscoverSendTargetsQuery' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed in vdscommand to DiscoverSendTargetsVDS, error = Failed discovery of iSCSI targets: u"portal=10.35.70.48:3260, err=(21, [], ['iscsiadm: No portals found'])" (Failed with error iSCSIDiscoveryError and code 475)
2018-04-26 17:23:50,229+03 ERROR [org.ovirt.engine.core.bll.storage.connection.DiscoverSendTargetsQuery] (default task-30) [506e428c-5a5e-4327-a68b-0105ca49280f] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed in vdscommand to DiscoverSendTargetsVDS, error = Failed discovery of iSCSI targets: u"portal=10.35.70.48:3260, err=(21, [], ['iscsiadm: No portals found'])" (Failed with error iSCSIDiscoveryError and code 475)
        at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:118) [bll.jar:]
        at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
        at org.ovirt.engine.core.bll.QueriesCommandBase.runVdsCommand(QueriesCommandBase.java:238) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.connection.DiscoverSendTargetsQuery.executeQueryCommand(DiscoverSendTargetsQuery.java:18) [bll.jar:]
        at org.ovirt.engine.core.bll.QueriesCommandBase.executeCommand(QueriesCommandBase.java:106) [bll.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
        at org.ovirt.engine.core.bll.executor.DefaultBackendQueryExecutor.execute(DefaultBackendQueryExecutor.java:14) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runQueryImpl(Backend.java:538) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runQuery(Backend.java:507) [bll.jar:]
        at sun.reflect.GeneratedMethodAccessor123.invoke(Unknown Source) [:1.8.0_171]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_171]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_171]
        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
        at org.jboss.as.weld.ejb.DelegatingInterceptorInvocationContext.proceed(DelegatingInterceptorInvocationContext.java:92) [wildfly-weld-ejb-7.1.1.GA-redhat-2.jar:7.1.1.GA-redhat-2]
        at org.jboss.weld.interceptor.proxy.WeldInvocationContext.interceptorChainCompleted(WeldInvocationContext.java:98) [weld-core-impl.jar:2.4.3.Final-redhat-1]
        at org.jboss.weld.interceptor.proxy.WeldInvocationContext.proceed(WeldInvocationContext.java:117) [weld-core-impl.jar:2.4.3.Final-redhat-1]
        at org.ovirt.engine.core.common.di.interceptor.LoggingInterceptor.apply(LoggingInterceptor.java:12) [common.jar:]
        at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source) [:1.8.0_171]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_171]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_171]
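The failing VDSM call wraps a plain iscsiadm sendtargets discovery; running it by hand from the host reproduces the error independently of the engine (portal address taken from the log above):

```shell
# Manual sendtargets discovery against the portal from the log above;
# exit code 21 with "iscsiadm: No portals found" matches the vdsm error
iscsiadm -m discovery -t sendtargets -p 10.35.70.48:3260
```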

Comment 20 Nikolai Sednev 2018-05-01 13:53:26 UTC
The connectivity issue from comment #17 with the Windows 2016 target was caused by the target's initiators list, which had to include the IPs (or IQNs) of both the SHE-VM and the ha-hosts, as shown in the attached screenshot.

Deployment passed successfully on these components:
ovirt-engine-4.2.3.3-0.1.el7.noarch
rhvm-appliance-4.2-20180427.0.el7.noarch
ovirt-hosted-engine-setup-2.2.19-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.11-1.el7ev.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

MS Windows 2016 server was deployed on separate RHV environment on VM.
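Each ha-host's initiator IQN, which must be listed on the Windows target as described in comment #20, can be read on the host side like this (a sketch; on the Windows side the counterpart should be the iSCSI Target Server's Set-IscsiServerTarget cmdlet with its -InitiatorIds parameter):

```shell
# The host's iSCSI initiator name; add this IQN (or the host's IP)
# to the target's initiators list on the Windows Server 2016 side
cat /etc/iscsi/initiatorname.iscsi
```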

Comment 21 Nikolai Sednev 2018-05-01 13:53:56 UTC
Created attachment 1429179 [details]
Initiators

Comment 22 Sandro Bonazzola 2018-05-04 10:48:16 UTC
This bugzilla is included in the oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.