Bug 1415407

Summary: Shutdown of VM with 2 or more disks fails
Product: [oVirt] ovirt-engine Reporter: Avihai <aefrat>
Component: BLL.Storage    Assignee: Liron Aravot <laravot>
Status: CLOSED NOTABUG QA Contact: Raz Tamir <ratamir>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.1.0    CC: aefrat, ahadas, bugs, laravot, pkliczew, tnisan, vfeenstr
Target Milestone: ovirt-4.1.1    Flags: rule-engine: ovirt-4.1+
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-02-07 12:45:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
engine & vdsm logs
Engine, SPM + HSM logs
guest engine log
vdsm (of relevant host) full log
engine full log
debug engine and vdsm logs

Description Avihai 2017-01-21 17:26:10 UTC
Created attachment 1243166 [details]
engine & vdsm logs

Description of problem:
Shutdown of a stateless VM with a direct LUN disk fails and the VM comes back up again.
The event log states "Shutdown of VM failed".


Version-Release number of selected component (if applicable):
Engine:
ovirt-engine-4.1.0.1-0.4.master.20170118134729.gitf34da1f.el7.centos.noarch

Vdsm:
4.19.2-1

How reproducible:
So far (tried twice) it is 100% reproducible.


Steps to Reproduce:
Did the following via the WebAdmin GUI (an API-level sketch of the same flow is shown after these steps):
1. Create a VM (named test) from a template (golden RHEL 7.2 image).
2. Create a direct LUN disk (named test_disk).
3. Attach test_disk to the VM.
4. Run the VM once as stateless - the VM is up and all is well.
5. Shut down the VM - initiated at 18:52:40.
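For reference, a rough API-level sketch of the same flow, assuming the oVirt Python SDK (ovirt-engine-sdk4); the engine URL, credentials, cluster, template and storage-domain names are placeholders, and the actual reproduction in this report was done through the WebAdmin GUI, not this script:

# Hypothetical reproduction sketch (ovirt-engine-sdk4); placeholder names throughout.
import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

conn = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',  # placeholder engine URL
    username='admin@internal',
    password='password',
    insecure=True,
)
vms_service = conn.system_service().vms_service()

# 1. Create a VM named 'test' from the golden RHEL 7.2 template.
vm = vms_service.add(types.Vm(
    name='test',
    cluster=types.Cluster(name='Default'),
    template=types.Template(name='golden_rhel_7_2'),
))
vm_service = vms_service.vm_service(vm.id)

# 2 + 3. Add a second disk (a plain image disk here; the issue also reproduces
# without a direct LUN, see comment 2).
vm_service.disk_attachments_service().add(types.DiskAttachment(
    disk=types.Disk(
        name='test_disk',
        format=types.DiskFormat.COW,
        provisioned_size=1 * 2**30,
        storage_domains=[types.StorageDomain(name='data_domain')],  # placeholder
    ),
    interface=types.DiskInterface.VIRTIO,
    bootable=False,
    active=True,
))

# 4. Run once as stateless and wait until the VM is up.
vm_service.start(vm=types.Vm(stateless=True))
while vm_service.get().status != types.VmStatus.UP:
    time.sleep(5)

# 5. Shut down the VM; in the reported scenario the engine later logs
# "Shutdown of VM test failed" and the VM goes back to Up.
vm_service.shutdown()

conn.close()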

Actual results:
Five minutes (!) later (at 18:57:46) the event log showed:
"Shutdown of VM test failed"
Afterwards the VM went back to Up.


Expected results:
Shutdown of the VM should succeed and the VM should power off.


Additional info:
2017-01-21 18:57:46,714+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler6) [30031977] VM '387d8bb4-68ba-4881-8014-742d906a345e'(test) moved from 'PoweringDown' --> 'Up'
2017-01-21 18:57:46,897+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [30031977] EVENT_ID: VM_POWER_DOWN_FAILED(147), Correlation ID: null, Call Stack: null, Custom Event ID: -1,
 Message: Shutdown of VM test failed.

Comment 1 Yaniv Kaul 2017-01-22 07:52:43 UTC
- Did it have a guest agent?
- Anything related to the direct lun? (it works if there's no direct LUN)? 
- Does it happen only when it is run as stateless?


Perhaps I'm missing something, but I can't seem to find the run command in vdsm log.

Comment 2 Avihai 2017-01-22 09:19:49 UTC
(In reply to Yaniv Kaul from comment #1)
> - Did it have a guest agent?

- Yes, the guest agent is installed.

> - Anything related to the direct lun? (it works if there's no direct LUN)?
- The same issue also occurs without a direct LUN (with a regular image disk).
  I attached the logs again for both engine and vdsm (now you should see the run command).

Engine log:
2017-01-22 11:06:27,258+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler2) [e8df0aa] EVENT_ID: VM_POWER_DOWN_FAILED(147), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Shutdown of VM golden_env_mixed_virtio_3_1 failed.


> - Does it happen/ only when it is run as stateless?
No, this is more severe: it also happens when the VM is not run as stateless.


> Perhaps I'm missing something, but I can't seem to find the run command in
> vdsm log.
My guess is that this is because the VM is hosted on another HSM host (host3), while I attached the SPM host's (host2) vdsm.log.

Now I have also added the HSM log (vdsm_HSM_short.log) along with the other logs, so it should all be there.

Comment 3 Avihai 2017-01-22 09:20:42 UTC
Created attachment 1243301 [details]
Engine , SPM + HSM logs

Comment 4 Yaniv Kaul 2017-01-22 09:32:35 UTC
(In reply to Avihai from comment #2)
> (In reply to Yaniv Kaul from comment #1)
> > - Did it have a guest agent?
> 
> - Yes , guest agent is installed 
> 
> > - Anything related to the direct lun? (it works if there's no direct LUN)?
> - Also without direct lun (regular disk image) it same issue occur .
>   I attached logs again of both engine+ vdsm (now you should see the run
> command)
> 
> Engine log:
> 2017-01-22 11:06:27,258+02 WARN 
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (DefaultQuartzScheduler2) [e8df0aa] EVENT_ID: VM_POWER_DOWN_FAILED(147),
> Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message:
> Shutdown of VM golden_env_mixed_virtio_3_1 failed.
> 
> 
> > - Does it happen/ only when it is run as stateless?
> No , this is more severe , it also happens when VM runs not as stateless .
> 
> 
> > Perhaps I'm missing something, but I can't seem to find the run command in
> > vdsm log.
> My guess is as this VM is hosted by another HSM host (host3) & I added the
> SPM host(host2) as vdsm.log .

Why guess? Why not check the logs?
How is the SPM related to powering off a VM?
Do you see the relevant command in the VDSM log?

> 
> Now , I also added the HSM log vdsm_HSM_short.log with other logs so it
> should all be there  .

Comment 5 Tal Nisan 2017-01-22 09:59:04 UTC
(In reply to Avihai from comment #2)
> (In reply to Yaniv Kaul from comment #1)
> > - Did it have a guest agent?
> 
> - Yes , guest agent is installed 
> 
> > - Anything related to the direct lun? (it works if there's no direct LUN)?
> - Also without direct lun (regular disk image) it same issue occur .
>   I attached logs again of both engine+ vdsm (now you should see the run
> command)
> 
> Engine log:
> 2017-01-22 11:06:27,258+02 WARN 
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (DefaultQuartzScheduler2) [e8df0aa] EVENT_ID: VM_POWER_DOWN_FAILED(147),
> Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message:
> Shutdown of VM golden_env_mixed_virtio_3_1 failed.
> 
> 
> > - Does it happen/ only when it is run as stateless?
> No , this is more severe , it also happens when VM runs not as stateless .
> 
> 
> > Perhaps I'm missing something, but I can't seem to find the run command in
> > vdsm log.
> My guess is as this VM is hosted by another HSM host (host3) & I added the
> SPM host(host2) as vdsm.log .
> 
> Now , I also added the HSM log vdsm_HSM_short.log with other logs so it
> should all be there  .
> 
> 

Please run the minimal reproduction scenario and write the steps to reproduce again; there is no point in attaching a direct LUN and running stateless if neither is needed to reproduce the issue.

Comment 6 Avihai 2017-01-22 10:02:11 UTC
(In reply to Yaniv Kaul from comment #4)
> (In reply to Avihai from comment #2)
> > (In reply to Yaniv Kaul from comment #1)
> > > - Did it have a guest agent?
> > 
> > - Yes , guest agent is installed 
> > 
> > > - Anything related to the direct lun? (it works if there's no direct LUN)?
> > - Also without direct lun (regular disk image) it same issue occur .
> >   I attached logs again of both engine+ vdsm (now you should see the run
> > command)
> > 
> > Engine log:
> > 2017-01-22 11:06:27,258+02 WARN 
> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> > (DefaultQuartzScheduler2) [e8df0aa] EVENT_ID: VM_POWER_DOWN_FAILED(147),
> > Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message:
> > Shutdown of VM golden_env_mixed_virtio_3_1 failed.
> > 
> > 
> > > - Does it happen/ only when it is run as stateless?
> > No , this is more severe , it also happens when VM runs not as stateless .
> > 
> > 
> > > Perhaps I'm missing something, but I can't seem to find the run command in
> > > vdsm log.
> > My guess is as this VM is hosted by another HSM host (host3) & I added the
> > SPM host(host2) as vdsm.log .
> 
> Why guess? Why not check the logs?
> How is the SPM related to powering off a VM?
> Do you see the relevant command on the VDSM log?
> 
My bad.
By default I attach the engine + SPM logs to a bug; I noticed the relevant host's log was missing and added it in the last comment.

The VM was not re-run: the shutdown failed and the VM remained up.
What is the relevant command you are looking for in this scenario?

Checking the HSM logs, I only see a prepareImage command at 10:54:16, roughly 10 minutes before the issue occurred (see the engine log above), which is when the VM was activated.

Comment 7 Yaniv Kaul 2017-01-22 10:04:06 UTC
1. Look for the command to shut it down.
2. Check guest agent logs (I hope there's something there?).

PrepareImage has little to do with shutting down a VM.

Comment 8 Avihai 2017-01-22 11:38:49 UTC
(In reply to Yaniv Kaul from comment #7)
> 1. Look for the command to shut it down.
In vdsm_HSM_short.log I see only:
2017-01-22 11:01:16,161 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call VM.shutdown succeeded in 0.00 seconds (__init__:515)


> 2. Check guest agent logs (I hope there's something there?).
There is an agent log (ovirt-guest-agent.log) in the VM, but I did not see anything I could understand in it - I attached it to this bug for your inspection.

 
> PrepareImage has little to do with shutting down a VM.
I misunderstood your earlier comment "find the run command in vdsm log";
I thought you meant you could not find the command that brings the VM back up, which does not exist - my bad.

Comment 9 Avihai 2017-01-22 11:39:18 UTC
Created attachment 1243324 [details]
guest engine log

Comment 10 Yaniv Kaul 2017-01-22 13:10:16 UTC
(In reply to Avihai from comment #9)
> Created attachment 1243324 [details]
> guest engine log

Vinzenz, can you take a look? I find it lacking quite a bit of information (for example, the version of the agent).

Comment 11 Avihai 2017-01-22 13:23:32 UTC
Guest agent package info (rpm -qa | grep agent):
rhevm-guest-agent-common-1.0.12-3.el7ev.noarch

Comment 12 Vinzenz Feenstra [evilissimo] 2017-01-23 07:57:58 UTC
(In reply to Yaniv Kaul from comment #10)
> (In reply to Avihai from comment #9)
> > Created attachment 1243324 [details]
> > guest engine log
> 
> Vincenz, can you take a look? I find it lacking quite a bit of information
> (for example, version of the agent).

The only thing I can see there is that the guest agent never gets a shutdown command from VDSM, but I can also see that the communication with VDSM is fine.

However, something fishy is going on in the VDSM logs - this might be because they are incomplete:


2017-01-22 10:54:01,109 INFO  (jsonrpc/3) [vdsm.api] START destroy args=(<virt.vm.Vm object at 0x374b3d0>, 1) kwargs={} (api:37)
2017-01-22 10:54:01,110 INFO  (jsonrpc/3) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') Release VM resources (vm:4092)
2017-01-22 10:54:01,116 INFO  (jsonrpc/3) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') Stopping connection (guestagent:430)
2017-01-22 10:54:01,116 INFO  (jsonrpc/3) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') _destroyVmGraceful attempt #0 (vm:4128)
2017-01-22 10:54:01,454 INFO  (jsonrpc/3) [dispatcher] Run and protect: teardownImage(sdUUID='08ae4631-d423-4b10-990b-1cd7c8274f03', spUUID='49479b26-a0d5-46ee-b14b-dcf9dc15980a', imgUUID='5ab76f56-b481-449b-98ed-98db55411f65', volUUID=None) (logUtils:49)
2017-01-22 10:54:01,456 INFO  (jsonrpc/3) [dispatcher] Run and protect: teardownImage, Return response: None (logUtils:52)
2017-01-22 10:54:01,456 INFO  (jsonrpc/3) [dispatcher] Run and protect: teardownImage(sdUUID='f48a30ab-b0a5-40b2-a0f6-d844847dbee1', spUUID='49479b26-a0d5-46ee-b14b-dcf9dc15980a', imgUUID='d05a4ecb-c815-4110-a957-727b9cf6e52a', volUUID=None) (logUtils:49)
2017-01-22 10:54:01,461 INFO  (libvirt/events) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') underlying process disconnected (vm:672)
2017-01-22 10:54:01,501 INFO  (jsonrpc/3) [storage.LVM] Deactivating lvs: vg=f48a30ab-b0a5-40b2-a0f6-d844847dbee1 lvs=['de93be07-a81d-49c0-ac97-78a9468c4adf', 'b11d88a4-36e6-4e19-8e6f-d508b754f8b3'] (lvm:1306)
2017-01-22 10:54:01,565 INFO  (jsonrpc/3) [dispatcher] Run and protect: teardownImage, Return response: None (logUtils:52)
2017-01-22 10:54:01,566 INFO  (jsonrpc/3) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') Stopping connection (guestagent:430)
2017-01-22 10:54:01,566 WARN  (jsonrpc/3) [root] File: /var/lib/libvirt/qemu/channels/2559d7fc-974f-42d6-aa9f-412878ca0a1d.com.redhat.rhevm.vdsm already removed (utils:120)
2017-01-22 10:54:01,566 WARN  (jsonrpc/3) [root] File: /var/lib/libvirt/qemu/channels/2559d7fc-974f-42d6-aa9f-412878ca0a1d.org.qemu.guest_agent.0 already removed (utils:120)
2017-01-22 10:54:01,567 INFO  (jsonrpc/3) [dispatcher] Run and protect: inappropriateDevices(thiefId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') (logUtils:49)
2017-01-22 10:54:01,568 INFO  (jsonrpc/3) [dispatcher] Run and protect: inappropriateDevices, Return response: None (logUtils:52)
2017-01-22 10:54:01,568 INFO  (jsonrpc/3) [vdsm.api] FINISH destroy return={'status': {'message': 'Done', 'code': 0}} (api:43)
2017-01-22 10:54:01,569 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call VM.destroy succeeded in 0.45 seconds (__init__:515)
2017-01-22 10:54:01,570 INFO  (libvirt/events) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') Changed state to Down: Admin shut down from the engine (code=6) (vm:1199)
2017-01-22 10:54:01,570 INFO  (libvirt/events) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') Stopping connection (guestagent:430)
2017-01-22 10:54:01,610 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call VM.destroy failed (error 1) in 0.00 seconds (__init__:515)
This failure reply doesn't seem to have a call ----------------------------------------^
Also the logger name says jsonrpc/5, so it does not match the jsonrpc/3 call - otherwise there is no additional information.

@Avihai Please add the full vdsm logs so we can investigate this further. Thanks.

Comment 13 Avihai 2017-01-23 08:54:59 UTC
Created attachment 1243525 [details]
vdsm (of relevant host) full log attached

Attached is the relevant full VDSM log.

Comment 14 Vinzenz Feenstra [evilissimo] 2017-01-23 09:05:41 UTC
@piotr: Can you please have a look at this? It doesn't make sense: every time there's a destroy RPC call, two responses are sent, seemingly by different threads. Can you check the attachment from c#13?

Thanks.

Comment 15 Piotr Kliczewski 2017-01-23 09:30:40 UTC
Since we reduced the amount of logging in vdsm significantly, I am not able to understand from the logs what happened. Please provide the engine log at debug level.

Comment 16 Vinzenz Feenstra [evilissimo] 2017-01-23 09:56:39 UTC
@aefrat: Please also set VDSM logging to debug and reproduce this; my assumption is that this problem is within VDSM only.

Comment 17 Avihai 2017-01-23 09:58:47 UTC
Created attachment 1243532 [details]
engine full log

Engine full log attached

Comment 18 Vinzenz Feenstra [evilissimo] 2017-01-23 10:01:03 UTC
Putting needinfo back for c#16

Comment 19 Piotr Kliczewski 2017-01-23 10:43:01 UTC
Provided log is not in debug. Please update:

ovirt-engine.xml.in and change:

      <logger category="org.ovirt" use-parent-handlers="false">
        <level name="DEBUG"/>
        <handlers>

and

      <root-logger>
        <level name="DEBUG"/>
        <handlers>

After the changes please restart the engine.

Comment 20 Avihai 2017-01-23 12:32:30 UTC
(In reply to Piotr Kliczewski from comment #19)
> Provided log is not in debug. Please update:
> 
> ovirt-engine.xml.in and change:
> 
>       <logger category="org.ovirt" use-parent-handlers="false">
>         <level name="DEBUG"/>
>         <handlers>
> 
> and
> 
>       <root-logger>
>         <level name="DEBUG"/>
>         <handlers>
> 
> After the changes please restart the engine.

Issue recreated with both debug Engine + Vdsm logs attached.

New Timeline :
2017-01-23 14:16:48 - VM shutdown initiated
2017-01-23 14:21:56 - Shutdown of VM test_vm failed

From engine_debug.log :
2017-01-23 14:21:56,894+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [] EVENT_ID: VM_POWER_DOWN_FAILED(147), Correlation ID: null, Call Stack: null, Custom Event ID: 
-1, Message: Shutdown of VM test_vm failed.

Comment 21 Avihai 2017-01-23 12:33:27 UTC
Created attachment 1243576 [details]
debug engine and vdsm logs

Comment 22 Piotr Kliczewski 2017-01-23 12:49:13 UTC
Here is the request that engine sent:

2017-01-23 14:09:46,944+02 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.impl.Message] (ForkJoinPool-1-worker-6) [] SEND
destination:jms.topic.vdsm_requests
reply-to:jms.topic.vdsm_responses
content-length:140

{"jsonrpc":"2.0","method":"VM.destroy","params":{"vmID":"2559d7fc-974f-42d6-aa9f-412878ca0a1d"},"id":"d4d0eb86-38a0-4d36-a810-8d01586b003f"}\00

and response:

MESSAGE
content-length:80
destination:jms.topic.vdsm_responses
content-type:application/json
subscription:131632b2-8608-4efd-a749-f8cef74ccc26

{"jsonrpc": "2.0", "id": "d4d0eb86-38a0-4d36-a810-8d01586b003f", "result": true}\00


and here is vdsm side:

2017-01-23 14:09:45,958 DEBUG (jsonrpc/3) [jsonrpc.JsonRpcServer] Calling 'VM.destroy' in bridge with {'vmID': '2559d7fc-974f-42d6-aa9f-412878ca0a1d'} (__init__:532)
2017-01-23 14:09:45,958 DEBUG (jsonrpc/3) [vds] About to destroy VM 2559d7fc-974f-42d6-aa9f-412878ca0a1d (API:317)
2017-01-23 14:09:45,958 INFO  (jsonrpc/3) [vdsm.api] START destroy args=(<virt.vm.Vm object at 0x25b11d0>, 1) kwargs={} (api:37)
2017-01-23 14:09:45,958 DEBUG (jsonrpc/3) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') destroy Called (vm:4182)
2017-01-23 14:09:45,958 DEBUG (jsonrpc/3) [virt.vm] (vmId='2559d7fc-974f-42d6-aa9f-412878ca0a1d') Total desktops after destroy of 2559d7fc-974f-42d6-aa9f-412878ca0a1d is 0 (vm:4178)
2017-01-23 14:09:45,958 INFO  (jsonrpc/3) [vdsm.api] FINISH destroy return={'status': {'message': 'Done', 'code': 0}} (api:43)
2017-01-23 14:09:45,958 DEBUG (jsonrpc/3) [jsonrpc.JsonRpcServer] Return 'VM.destroy' in bridge with True (__init__:557)
2017-01-23 14:09:45,959 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call VM.destroy succeeded in 0.00 seconds (__init__:515)

I do not see any issue with the messages. In the engine logs I see (as in c#20) that the WARN message was triggered by Host.getAllVmStats.
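For illustration, a minimal sketch of the correlation being applied here: a JSON-RPC response is matched to its request solely by the "id" field, so the two frames above belong to the same VM.destroy call (the payloads are copied from the log excerpts above):

# Minimal illustration: JSON-RPC responses are paired with requests by "id".
import json

request = json.loads(
    '{"jsonrpc":"2.0","method":"VM.destroy",'
    '"params":{"vmID":"2559d7fc-974f-42d6-aa9f-412878ca0a1d"},'
    '"id":"d4d0eb86-38a0-4d36-a810-8d01586b003f"}'
)
response = json.loads(
    '{"jsonrpc": "2.0", "id": "d4d0eb86-38a0-4d36-a810-8d01586b003f", "result": true}'
)

# The ids match, so this "result": true answers the VM.destroy call above.
# A reply carrying a different id (e.g. the one logged on jsonrpc/5 in comment 12)
# belongs to a different request, which is why it could not be paired with the
# jsonrpc/3 call.
assert request["id"] == response["id"]
print(request["method"], "->", response["result"])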

Comment 23 Avihai 2017-02-02 11:59:27 UTC
Clarification:
the bug occurs only if the VM has 2 or more disks.

Comment 24 Liron Aravot 2017-02-06 10:24:22 UTC
vdsm/engine operates as expected - we need to understand what happens within the guest.
Avihai,
1. Is the guest operational? What is its current status? Have you tried to open a console to it?
2. Can you run the scenario again and attach /var/log/messages and the dmesg output?
3. Please also attach the qemu log for the VM from the host (/var/log/libvirt/qemu/VMID.log).

thanks,
Liron

Comment 25 Avihai 2017-02-06 12:43:47 UTC
Liron , 

I think I know the source of this bug, which is in fact another bug;
please tell me if it makes sense to you:

I noticed that when I create a VM from a template
whose bootable image (rhv40_el72_ge_Disk1) was imported from Glance, the VM is created with a disk that IS NOT MARKED "bootable".

So when I added another disk, the VM chose the disk that has NO OS image
+
the template's disk is not marked bootable
= the VM comes up without an OS, so it cannot shut down.

So I think this bug should be closed as NOT A BUG and I should open a new bug:
"Creating a VM from a template does not mark its OS disk as bootable as expected"

What do you think?

Comment 26 Avihai 2017-02-06 12:47:48 UTC
Also, since the VM is up without ANY bootable disk, there is no way to edit the disks and mark the OS disk (the one created along with the VM from the template) as bootable.

So in fact we cannot shut down this VM, only power it off.

I think we should add logic to prevent shutting down an active VM that has no bootable disk, to avoid this failed-shutdown issue.

Comment 27 Liron Aravot 2017-02-06 16:54:57 UTC
(In reply to Avihai from comment #25)
> Liron , 
> 
> I think i know the source of this bug which is in fact another bug ,
> please tell me if it makes sense to you :
> 
> I noticed that when I create a VM from template,
> which is a bootable image(rhv40_el72_ge_Disk1 ) that was imported from
> glance , the VM is created with a disk that IS NOT MARKED "bootable" .
> 
> So when I added another disk the VM choose the disk that has NO OS image 
> +
> the templates disk is not marked bootable 
> = VM is activate without OS so it can not shutdown.
> 
> So I think this bug should be closed as NOT A BUG & I should open a new bug :
> "Creating a VM from template does not mark its OS disk as expected "
> 
> What do you think ?

It makes sense to me; AFAIK, when there's no running OS the behavior of Shutdown is undefined. Arik - you know that area better, am I correct?

(In reply to Avihai from comment #26)
> I think we should add additional logic to prevent shutting down an active VM
> without a bootable disk to avoid this shutdown failed issue .

I don't think that's needed; even if there is a bootable disk, we can't know whether anything is installed on it. Usually VMs will have an OS installed, so I'd say we can leave that as is.

Comment 28 Avihai 2017-02-07 06:18:34 UTC
(In reply to Liron Aravot from comment #27)
> (In reply to Avihai from comment #25)
> > Liron , 
> > 
> > I think i know the source of this bug which is in fact another bug ,
> > please tell me if it makes sense to you :
> > 
> > I noticed that when I create a VM from template,
> > which is a bootable image(rhv40_el72_ge_Disk1 ) that was imported from
> > glance , the VM is created with a disk that IS NOT MARKED "bootable" .
> > 
> > So when I added another disk the VM choose the disk that has NO OS image 
> > +
> > the templates disk is not marked bootable 
> > = VM is activate without OS so it can not shutdown.
> > 
> > So I think this bug should be closed as NOT A BUG & I should open a new bug :
> > "Creating a VM from template does not mark its OS disk as expected "
> > 
> > What do you think ?
> 
> It makes sense to me, AFAIK when there's no running OS the behavior of
> Shutdown is undefined. Arik - you know that field better, am i correct?
> 
> (In reply to Avihai from comment #26)
> > I think we should add additional logic to prevent shutting down an active VM
> > without a bootable disk to avoid this shutdown failed issue .
> 
> I don't think that's needed, even if there's a bootable disk we can't know
> if there's something installed on it or not. Usually vms will have os
> installed so I'd say we can leave that as is.

I think it is only logical that the first disk of a VM, or a new disk added while the prior disks are not bootable, should be marked as bootable by default; otherwise the VM will not boot an OS and we will hit this shutdown issue again.

To clarify:
Marking a newly created disk as bootable by default is already implemented for non-template VMs (see below) and should also be implemented when creating a VM from a template.

This logic already exists for non-template VMs:
1. When adding a new disk to a VM whose other disks are not bootable ->
the "bootable" option is checked by default (VM Disks GUI tab).

2. When creating a new VM (not from a template) without disks and then creating a new disk -> the "bootable" option is checked by default (VM Disks GUI tab).

What I suggest is:
when creating a VM from a template, mark its disk as bootable, to avoid this failed-shutdown issue.
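As an illustration of the kind of manual workaround this implies (not something verified in this report), a sketch assuming the oVirt Python SDK (ovirt-engine-sdk4): with the VM powered off, flip the bootable flag on the template-derived disk attachment. The connection details, VM name and the way the target disk is selected are placeholders:

# Hypothetical workaround sketch (ovirt-engine-sdk4); run with the VM powered off.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

conn = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',  # placeholder
    username='admin@internal',
    password='password',
    insecure=True,
)
vms_service = conn.system_service().vms_service()
vm = vms_service.list(search='name=test')[0]  # placeholder VM name
attachments_service = vms_service.vm_service(vm.id).disk_attachments_service()

for attachment in attachments_service.list():
    disk = conn.follow_link(attachment.disk)
    # Select the template/OS disk by whatever criterion applies (name, alias, ...).
    if disk.name == 'rhv40_el72_ge_Disk1' and not attachment.bootable:
        attachments_service.attachment_service(attachment.id).update(
            types.DiskAttachment(bootable=True)
        )
        break

conn.close()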

Comment 29 Arik 2017-02-07 08:56:21 UTC
(In reply to Avihai from comment #28)
> > It makes sense to me, AFAIK when there's no running OS the behavior of
> > Shutdown is undefined. Arik - you know that field better, am i correct?

Yes, this is a known "issue" - when there is no OS, the shutdown call puts the VM into PoweringDown status, but nothing is actually going on, so the VM is still reported as Up and is therefore switched back to Up. That is not something we intend to address, since there is no real use case for VMs with no OS.
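For illustration, a minimal sketch of the behavior described above (this is not the actual engine/VmAnalyzer code; the roughly five-minute window is inferred from the timestamps in this report, e.g. shutdown at 18:52:40 and "Shutdown of VM test failed" at 18:57:46):

# Simplified illustration of the observed engine behavior (NOT the real VmAnalyzer code).
import time

POWER_DOWN_TIMEOUT = 5 * 60  # seconds; assumed from the observed ~5-minute gap


def analyze_vm(db_status, vdsm_status, shutdown_started_at, now=None):
    """Return the next engine-side status for a monitored VM."""
    now = time.time() if now is None else now
    if db_status == 'PoweringDown' and vdsm_status == 'Up':
        # The guest never acted on the shutdown request (no OS to handle it),
        # so vdsm keeps reporting Up. After the timeout the engine gives up,
        # emits VM_POWER_DOWN_FAILED and moves the VM back to Up.
        if now - shutdown_started_at > POWER_DOWN_TIMEOUT:
            print('EVENT: Shutdown of VM failed')
            return 'Up'
        return 'PoweringDown'  # still waiting for the guest to power off
    return vdsm_status


# Example: six minutes after shutdown was issued, the guest still reports Up.
print(analyze_vm('PoweringDown', 'Up', shutdown_started_at=0, now=6 * 60))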

Comment 30 Avihai 2017-02-07 12:41:50 UTC
Arik, Liron, as I said, this issue is not a bug and can be closed.

A template-based VM created without the "bootable" mark on its first disk causes this to occur.
Raz has already opened a bug on this - bug 1247950.

Comment 31 Liron Aravot 2017-02-07 12:46:08 UTC
Thanks Avihai,
let's continue the discussion there.