Bug 1215623

Summary: [BLOCKED] Deploying on a shared block device, hosted-engine is unable to add the VM Disk as a direct LUN
Product: Red Hat Enterprise Virtualization Manager
Reporter: Artyom <alukiano>
Component: ovirt-hosted-engine-setup
Assignee: Allon Mureinik <amureini>
Status: CLOSED DUPLICATE
QA Contact: Elad <ebenahar>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.6.0
CC: acanan, ahino, alukiano, amureini, ebenahar, ecohen, gklein, lsurette, mlipchuk, sbonazzo, stirabos, tnisan, ylavi
Target Milestone: ovirt-3.6.3
Keywords: Regression, TestBlocker
Target Release: 3.6.0
Hardware: x86_64
OS: Linux
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-06-03 08:13:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1220824, 1222010, 1222058, 1223482, 1242215
Bug Blocks: 1002454, 1036731, 1101553, 1150087, 1153278, 1155637, 1167074, 1169290
Attachments (Description: Flags):
setup log: none
new log include engine log: none

Description Artyom 2015-04-27 10:45:43 UTC
Created attachment 1019292 [details]
setup log

Description of problem:
Failed to deploy a hosted-engine (HE) environment on iSCSI storage.
The deployment failed at this stage:
[ INFO  ] Engine replied: DB Up!Welcome to Health Status!
          Enter the name of the cluster to which you want to add the host (Default) [Default]: 
[ INFO  ] Waiting for the host to become operational in the engine. This may take several minutes...
[ INFO  ] Still waiting for VDSM host to become operational...
[ INFO  ] The VDSM Host is now operational
[ ERROR ] Cannot add the Hosted Engine VM Disk to the engine
[ ERROR ] Failed to execute stage 'Closing up': Cannot add the Hosted Engine VM Disk to the engine
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150427074023.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150401110307.git9665976.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Run hosted-engine --deploy with iSCSI storage
2. Continue until you receive the error above

Actual results:
Deployment failed

Expected results:
Deployment succeeds without any errors

Additional info:

Comment 1 Allon Mureinik 2015-04-28 07:44:16 UTC
Flagging as integration for initial research; we'll move it to storage if needed.

Comment 2 Simone Tiraboschi 2015-05-04 11:45:54 UTC
Artyom,
could you please attach engine logs from that engine VM?

What are you installing on the engine VM?

Thanks

Comment 3 Artyom 2015-05-04 15:28:39 UTC
Created attachment 1021785 [details]
new log include engine log

On the engine VM I have:
ovirt-engine-3.6.0-0.0.master.20150412172306.git55ba764.el6.noarch
Red Hat Enterprise Linux Server release 6.6 (Santiago)
2.6.32-504.el6.x86_64

Comment 4 Simone Tiraboschi 2015-05-11 10:48:20 UTC
*** Bug 1220152 has been marked as a duplicate of this bug. ***

Comment 6 Sandro Bonazzola 2015-05-14 15:39:42 UTC
2015-05-04 18:23:10,843 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVdsCommand] (DefaultQuartzScheduler_Worker-85) [] START, FullListVdsCommand(HostName = , HostId = f3ec897f-43b1-47e7-99a9-79f059ddeaaa, vds=Host[,f3ec897f-43b1-47e7-99a9-79f059ddeaaa], vmIds=[bfea96d6-8871-424f-a059-1de921882530]), log id: 6acb4e15
2015-05-04 18:23:10,855 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVdsCommand] (DefaultQuartzScheduler_Worker-85) [] FINISH, FullListVdsCommand, return: [{guestFQDN=, emulatedMachine=rhel6.5.0, pid=5152, guestDiskMapping={}, displaySecurePort=-1, cpuType=Opteron_G5, pauseCode=NOERR, smp=2, vmType=kvm, memSize=4096, vmName=HostedEngine, username=Unknown, clientIp=, vmId=bfea96d6-8871-424f-a059-1de921882530, displayIp=0, displayPort=5900, spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir, nicModel=rtl8139,pv, devices=[Ljava.lang.Object;@3caf6a9b, status=Up, guestIPs=, display=vnc}], log id: 6acb4e15
2015-05-04 18:23:10,857 INFO  [org.ovirt.engine.core.vdsbroker.VmsMonitoring] (DefaultQuartzScheduler_Worker-85) [] Importing VM 'HostedEngine' as 'HostedEngine', as it is running on the on Host, but does not exist in the engine.
2015-05-04 18:23:10,942 INFO  [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (DefaultQuartzScheduler_Worker-85) [1c4d795d] Lock Acquired to object 'EngineLock [exclusiveLocks= key: HostedEngine value: VM_NAME
, sharedLocks= ]'
2015-05-04 18:23:10,994 WARN  [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (DefaultQuartzScheduler_Worker-85) [1c4d795d] CanDoAction of action 'AddVmFromScratch' failed for user null@N/A. Reasons: VAR__ACTION__ADD,VAR__TYPE__VM,ACTION_TYPE_FAILED_IMAGE_REPOSITORY_NOT_FOUND
2015-05-04 18:23:10,995 INFO  [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (DefaultQuartzScheduler_Worker-85) [1c4d795d] Lock freed to object 'EngineLock [exclusiveLocks= key: HostedEngine value: VM_NAME
, sharedLocks= ]'

Tal, can you check why this is failing?

Comment 8 Maor 2015-05-15 12:13:13 UTC
Thanks for pointing that out in the logs, Sandro.

The ACTION_TYPE_FAILED_IMAGE_REPOSITORY_NOT_FOUND validation checks whether there is an active Data Center with status UP to add the VM to.
This validation was merged recently with Change-Id: I2586b0026c85f053b6cd0aebeb555d760afc0937.

Could the failing installation be caused by that?
Sandro, correct me if I'm wrong: as I remember it, the hosted engine doesn't use any Data Center in the engine. Or is there some kind of special behavior there?
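
For anyone scanning engine.log for the same symptom, here is a minimal sketch of pulling the failure reasons out of a CanDoAction warning like the one quoted in comment 6. The function name and the regular expression are illustrative assumptions, not part of the engine or of hosted-engine-setup; the sample line is copied from the log above.

```python
import re

# Assumption: this pattern is derived only from the single log line quoted
# in this bug; other engine versions may format the message differently.
CAN_DO_ACTION_RE = re.compile(
    r"CanDoAction of action '(?P<action>[^']+)' failed.*?Reasons:\s*(?P<reasons>\S+)"
)


def parse_can_do_action_failure(line):
    """Return (action, [reason, ...]), or None if the line is not a CanDoAction failure."""
    match = CAN_DO_ACTION_RE.search(line)
    if match is None:
        return None
    return match.group("action"), match.group("reasons").split(",")


line = (
    "2015-05-04 18:23:10,994 WARN  [org.ovirt.engine.core.bll.AddVmFromScratchCommand] "
    "(DefaultQuartzScheduler_Worker-85) [1c4d795d] CanDoAction of action "
    "'AddVmFromScratch' failed for user null@N/A. Reasons: "
    "VAR__ACTION__ADD,VAR__TYPE__VM,ACTION_TYPE_FAILED_IMAGE_REPOSITORY_NOT_FOUND"
)

action, reasons = parse_can_do_action_failure(line)
# Drop the VAR__ message placeholders and keep only the failure codes.
failures = [reason for reason in reasons if not reason.startswith("VAR__")]
print(action, failures)
```

Filtering out the VAR__ entries leaves only the validation code discussed here, which makes it easier to grep a long log for the check that actually rejected the command.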

Comment 9 Maor 2015-05-15 12:17:02 UTC
Please remove the security keyword, which was added by mistake, from the bug

Thanks,
Maor

Comment 10 Sandro Bonazzola 2015-05-15 12:28:36 UTC
Trying, but I think only the security group can remove it now.

Comment 11 Sandro Bonazzola 2015-05-15 12:34:15 UTC
Maor, hosted engine creates its own monitored storage domain, without a storage pool. It creates the VM using an image in this storage domain and installs the engine inside it.
At this point, it tries to add the host to the engine and fails with the reported error.

We have bug #1215158 to let the engine know about this storage domain and show it in the web UI.
It looks like solving it will fix this bug as well.

Comment 12 Simone Tiraboschi 2015-05-15 13:22:46 UTC
We were conflating two related but distinct issues.

I created a new bug to track what happens when hosted-engine tries to add the engine VM to the engine: https://bugzilla.redhat.com/show_bug.cgi?id=1222010
It currently fails because there is no active storage domain and the engine tries to enforce one.
This does not depend on the storage type.
The hosted engine VM isn't shown in the engine, but the setup completes correctly.

After that step, only for shared block storage devices (currently iSCSI and FC), we try to add the LUN where we deployed the hosted engine VM as a direct LUN, to prevent any further use of it from the engine (otherwise the whole setup would be destroyed).
This bug applies only to iSCSI and FC.
In this case, failing to add the LUN as a direct LUN makes the whole setup fail with:
[ ERROR ] Cannot add the Hosted Engine VM Disk to the engine
[ ERROR ] Failed to execute stage 'Closing up': Cannot add the Hosted Engine VM Disk to the engine
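
The storage-type distinction described in this comment can be sketched as follows. This is only an illustration of the decision, assuming hypothetical names (`SHARED_BLOCK_STORAGE`, `needs_direct_lun`, `closing_up_steps`) and an illustrative "nfs" label for non-block storage; it is not the actual ovirt-hosted-engine-setup code.

```python
# Hypothetical labels: the real setup code may name storage types differently.
SHARED_BLOCK_STORAGE = {"iscsi", "fc"}


def needs_direct_lun(storage_type):
    """True when the hosted-engine LUN should be registered as a direct LUN."""
    return storage_type.lower() in SHARED_BLOCK_STORAGE


def closing_up_steps(storage_type):
    """List the engine-side steps described in this comment, in order."""
    # Attempted for every storage type (tracked separately as bug 1222010).
    steps = ["add the engine VM to the engine"]
    if needs_direct_lun(storage_type):
        # Only for shared block storage; this is the step that fails in this bug.
        steps.append("add the hosted-engine LUN as a direct LUN")
    return steps


# "nfs" stands in for any non-block storage type here.
for storage in ("nfs", "iSCSI", "FC"):
    print(storage, closing_up_steps(storage))
```

Under these assumptions, only the iSCSI and FC deployments reach the direct LUN step, which matches the observation that this bug does not affect other storage types.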

Comment 13 Maor 2015-05-16 07:41:01 UTC
Hi Simone,

Thanks for the input.
Are BZ1222010 and BZ1222058 the two scenarios you described at https://bugzilla.redhat.com/show_bug.cgi?id=1215623#c12? If so, can we please close this one?
If not, can we please change the summary of this bug to describe the relevant issue?

Thanks,
Maor

Comment 14 Simone Tiraboschi 2015-05-19 11:51:13 UTC
(In reply to Maor from comment #13)
> Are BZ1222010 and BZ1222058 the two scenarios you described at
> https://bugzilla.redhat.com/show_bug.cgi?id=1215623#c12? If so, can we
> please close this one?
> If not, can we please change the summary of this bug to describe the relevant issue?

No, unfortunately BZ1222058 is different, so we need all three.

Comment 15 Simone Tiraboschi 2015-05-20 15:31:39 UTC
Ok, found it!

It only shows up in server.log; there is nothing in engine.log.
ovirtsdk.api.API.disks.add(disk) now fails due to a java.lang.NullPointerException on the server side.
The same code works against a 3.5.z engine.

2015-05-20 17:22:14,659 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/ovirt-engine/api].[org.ovirt.engine.api.restapi.BackendApplication]] (ajp--127.0.0.1-8702-5) Servlet.service() for servlet org.ovirt.engine.api.restapi.BackendApplication threw exception: org.jboss.resteasy.spi.UnhandledException: java.lang.NullPointerException
	at org.jboss.resteasy.core.SynchronousDispatcher.handleApplicationException(SynchronousDispatcher.java:340) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.SynchronousDispatcher.handleException(SynchronousDispatcher.java:214) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.SynchronousDispatcher.handleInvokerException(SynchronousDispatcher.java:190) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.SynchronousDispatcher.getResponse(SynchronousDispatcher.java:540) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:502) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:119) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:208) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:55) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:50) [resteasy-jaxrs-2.3.2.Final.jar:]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:847) [jboss-servlet-api_3.0_spec-1.0.0.Final.jar:1.0.0.Final]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:329) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.api.common.invocation.CurrentFilter.doFilter(CurrentFilter.java:65) [interface-common-jaxrs.jar:]
	at org.ovirt.engine.api.common.invocation.CurrentFilter.doFilter(CurrentFilter.java:47) [interface-common-jaxrs.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.core.aaa.filters.RestApiSessionMgmtFilter.doFilter(RestApiSessionMgmtFilter.java:69) [aaa.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.core.aaa.filters.EnforceAuthFilter.doFilter(EnforceAuthFilter.java:39) [aaa.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.core.aaa.filters.LoginFilter.doFilter(LoginFilter.java:74) [aaa.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.core.aaa.filters.NegotiationFilter.doFilter(NegotiationFilter.java:113) [aaa.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.core.aaa.filters.BasicAuthenticationFilter.doFilter(BasicAuthenticationFilter.java:90) [aaa.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.core.aaa.filters.SessionValidationFilter.doFilter(SessionValidationFilter.java:73) [aaa.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.core.aaa.filters.EngineSessionTokenAuthenticationFilter.doFilter(EngineSessionTokenAuthenticationFilter.java:31) [aaa.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.core.aaa.filters.RestApiSessionValidationFilter.doFilter(RestApiSessionValidationFilter.java:32) [aaa.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.api.common.security.CSRFProtectionFilter.doFilter(CSRFProtectionFilter.java:110) [interface-common-jaxrs.jar:]
	at org.ovirt.engine.api.common.security.CSRFProtectionFilter.doFilter(CSRFProtectionFilter.java:101) [interface-common-jaxrs.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.ovirt.engine.api.common.security.CORSSupportFilter.doFilter(CORSSupportFilter.java:182) [interface-common-jaxrs.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489) [jbossweb-7.0.13.Final.jar:]
	at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153) [jboss-as-web-7.1.1.Final.jar:7.1.1.Final]
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) [jbossweb-7.0.13.Final.jar:]
	at org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) [jbossweb-7.0.13.Final.jar:]
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368) [jbossweb-7.0.13.Final.jar:]
	at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505) [jbossweb-7.0.13.Final.jar:]
	at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445) [jbossweb-7.0.13.Final.jar:]
	at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930) [jbossweb-7.0.13.Final.jar:]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_79]
Caused by: java.lang.NullPointerException
	at org.ovirt.engine.api.restapi.resource.BackendDisksResource.getStorageDomainById(BackendDisksResource.java:66) [restapi-jaxrs.jar:]
	at org.ovirt.engine.api.restapi.resource.BackendDisksResource.add(BackendDisksResource.java:38) [restapi-jaxrs.jar:]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.7.0_79]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [rt.jar:1.7.0_79]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_79]
	at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_79]
	at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:155) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.ResourceMethod.invokeOnTarget(ResourceMethod.java:257) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.ResourceMethod.invoke(ResourceMethod.java:222) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.ResourceMethod.invoke(ResourceMethod.java:211) [resteasy-jaxrs-2.3.2.Final.jar:]
	at org.jboss.resteasy.core.SynchronousDispatcher.getResponse(SynchronousDispatcher.java:525) [resteasy-jaxrs-2.3.2.Final.jar:]
	... 56 more

it's probably related to https://bugzilla.redhat.com/show_bug.cgi?id=1147860
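
Since the relevant frames are buried under dozens of servlet filter frames, here is a minimal sketch of extracting the root cause from a trace like the one above. The `root_cause` helper and its regular expression are assumptions made for illustration; the sample input is a shortened copy of the trace in this comment.

```python
import re

# Matches a Java stack frame such as "\tat pkg.Class.method(File.java:66) [jar:]".
FRAME_RE = re.compile(r"^\s*at\s+(?P<method>[\w.$<>]+)\((?P<location>[^)]*)\)")


def root_cause(trace):
    """Return (exception, first_frame) for the last 'Caused by:' section, or None."""
    lines = trace.splitlines()
    cause_index = None
    for index, line in enumerate(lines):
        if line.startswith("Caused by:"):
            cause_index = index  # keep the last one: the innermost cause
    if cause_index is None:
        return None
    exception = lines[cause_index][len("Caused by:"):].strip()
    for line in lines[cause_index + 1:]:
        match = FRAME_RE.match(line)
        if match:
            frame = "%s (%s)" % (match.group("method"), match.group("location"))
            return exception, frame
    return exception, None


trace = """\
org.jboss.resteasy.spi.UnhandledException: java.lang.NullPointerException
\tat org.jboss.resteasy.core.SynchronousDispatcher.handleApplicationException(SynchronousDispatcher.java:340) [resteasy-jaxrs-2.3.2.Final.jar:]
Caused by: java.lang.NullPointerException
\tat org.ovirt.engine.api.restapi.resource.BackendDisksResource.getStorageDomainById(BackendDisksResource.java:66) [restapi-jaxrs.jar:]
\tat org.ovirt.engine.api.restapi.resource.BackendDisksResource.add(BackendDisksResource.java:38) [restapi-jaxrs.jar:]
"""

exception, frame = root_cause(trace)
print(exception)
print(frame)
```

For this trace the helper points at BackendDisksResource.getStorageDomainById, called from BackendDisksResource.add, which is the server-side half of the disks.add(disk) call mentioned above.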

Comment 16 Allon Mureinik 2015-06-03 08:06:05 UTC
(In reply to Simone Tiraboschi from comment #15)

> it's probably related to https://bugzilla.redhat.com/show_bug.cgi?id=1147860
Can you explain?
BZ 1147860 is about direct LUNs, which you could never move/copy/whatever. What does this have to do with HE?

Comment 17 Simone Tiraboschi 2015-06-03 08:13:06 UTC
When deploying hosted-engine-setup on iSCSI (and now also on FC), we add the LUN where we deployed the engine VM as a direct LUN, to prevent any other use of that LUN that could destroy the engine itself.
Please see:
https://bugzilla.redhat.com/show_bug.cgi?id=1157243
https://bugzilla.redhat.com/show_bug.cgi?id=1157238

This one is indeed just a duplicate of bug 1220824.

Comment 18 Simone Tiraboschi 2015-06-03 08:13:46 UTC

*** This bug has been marked as a duplicate of bug 1220824 ***