Bug 1168709

Summary: Hosted Engine VM is listed as paused after upgrading from 3.4.4 to 3.5.1 snapshot
Product: [Retired] oVirt Reporter: Sandro Bonazzola <sbonazzo>
Component: vdsm Assignee: Francesco Romani <fromani>
Status: CLOSED CURRENTRELEASE QA Contact: Gil Klein <gklein>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.5 CC: bazulay, dfediuck, ecohen, gklein, iheim, lsurette, mgoldboi, michal.skrivanek, oourfali, rbalakri, yeylon
Target Milestone: ---   
Target Release: 3.5.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: virt
Fixed In Version: ovirt-3.5.1_rc1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-01-21 16:03:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1155170    
Attachments:
Description Flags
screenshot none
log collector report none

Description Sandro Bonazzola 2014-11-27 15:59:38 UTC
Created attachment 962147 [details]
screenshot

On a working 3.4.4 CentOS system I upgraded the host to a 3.5.1 snapshot and hit bug #1168689 and bug #1168695.

Following http://www.ovirt.org/Hosted_Engine_Howto#Upgrade_Hosted_Engine

I did
 vdsm-tool configure --module libvirt --force
 service vdsmd start
 service ovirt-ha-broker restart && service ovirt-ha-agent restart
and then updated the cluster and the datacenter to 3.5.

The engine is up and running and the host looks OK, but the Hosted Engine VM is listed as paused.

Comment 1 Sandro Bonazzola 2014-11-27 16:12:58 UTC
Created attachment 962160 [details]
log collector report

Comment 2 Sandro Bonazzola 2014-11-28 10:48:58 UTC
Reproducible on F20 hosts and F19 VM too.

Comment 3 Sandro Bonazzola 2014-12-09 10:40:34 UTC
Michal can you take a look?

Comment 4 Sandro Bonazzola 2014-12-09 10:42:16 UTC
Oved, can you take a look too?

Comment 5 Oved Ourfali 2014-12-09 11:40:00 UTC
(In reply to Sandro Bonazzola from comment #4)
> Oved, can you take a look too?

I'll let Michal have an additional look. It seems the host was non-responsive, someone confirmed it was rebooted, and once it was up again the VM status moved from Unknown to Paused. Apparently the status was "restored", which makes me wonder why it was Paused in the first place.

If we have communication to the host, why can't we get the real status of the VM?

Comment 7 Michal Skrivanek 2014-12-09 12:31:15 UTC
seems there's an error on VM recovery.

Comment 8 Francesco Romani 2014-12-09 13:32:06 UTC
ok, I think I got it.

Graphics devices may not be sent by Engine. To cover that case, VDSM reconstructs them from other VM configuration data, which is safe to do since the information is there, just organized in a different way.

The recovery path skips that step, so the configuration is inconsistent.
Later in the creation path the configuration is assumed to be present (either fixed by VDSM or given by Engine), but it is not, and this makes the domain initialization fail in the last stages.

Patch coming soon.
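
The failure mode described above can be illustrated with a minimal sketch. This is not VDSM's actual code: every function name, the `display` key, and the shape of the configuration dict are assumptions made up for illustration. It only shows how a normalization step applied on the creation path but skipped on the recovery path leaves a later stage without the data it expects.

```python
import copy

# Hypothetical sketch: names and dict layout are illustrative, not VDSM's API.


def normalize_graphics(conf):
    """Rebuild a graphics device entry from other configuration data if absent."""
    devices = conf.setdefault("devices", [])
    if not any(dev.get("type") == "graphics" for dev in devices):
        # 'display' is an assumed legacy key carrying the graphics setting.
        devices.append({"type": "graphics", "device": conf.get("display", "vnc")})
    return conf


def build_domain(conf):
    """Late creation stage: assumes a graphics device is always present."""
    graphics = [d for d in conf.get("devices", []) if d.get("type") == "graphics"]
    if not graphics:
        raise RuntimeError("no graphics device in VM configuration")
    return {"graphics": graphics[0]["device"]}


def create_vm(conf):
    """Creation path: normalizes the configuration before building the domain."""
    return build_domain(normalize_graphics(copy.deepcopy(conf)))


def recover_vm_buggy(conf):
    """Recovery path that skips normalization, so the late stage fails."""
    return build_domain(copy.deepcopy(conf))


def recover_vm_fixed(conf):
    """Recovery path that applies the same normalization as creation."""
    return build_domain(normalize_graphics(copy.deepcopy(conf)))
```

With a configuration that carries no explicit graphics device, `create_vm` and `recover_vm_fixed` succeed while `recover_vm_buggy` raises at the last stage, which matches the symptom of domain initialization failing during recovery.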

Comment 9 Sandro Bonazzola 2015-01-15 14:25:42 UTC
This is an automated message: 
This bug should be fixed in oVirt 3.5.1 RC1, moving to QA

Comment 10 Sandro Bonazzola 2015-01-21 16:03:02 UTC
oVirt 3.5.1 has been released. If problems still persist, please make note of it in this bug report.