
Bug 1280310

Summary: Failed to startup HE-VM after upgrade RHEV-H7.1-20151015 to RHEV-H7.2-20151104 (Device or resource busy)
Product: Red Hat Enterprise Virtualization Manager
Reporter: Chaofeng Wu <cwu>
Component: ovirt-hosted-engine-ha
Assignee: Martin Sivák <msivak>
Status: CLOSED ERRATA
QA Contact: Artyom <alukiano>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.6.0
CC: cshao, cwu, dfediuck, dougsland, fdeutsch, gklein, huiwa, huzhao, istein, leiwang, lsurette, mavital, mgoldboi, msivak, nsednev, sbonazzo, yaniwang, ycui, ykaul, ylavi
Target Milestone: ovirt-3.6.1
Keywords: TestBlocker, Triaged
Target Release: 3.6.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovirt-hosted-engine-ha-1.3.3.6-1.el7ev
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-09 19:50:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1291731
Bug Blocks: 1264065, 1285700
Attachments:
Description               Flags
sosreport                 none
/var/log and sosreport    none

Description Chaofeng Wu 2015-11-11 12:55:20 UTC
Created attachment 1092711 [details]
sosreport

Description of problem:
HE-VM was installed and configured successfully on RHEV-H7.1-20151015, then the host was upgraded to RHEV-H7.2-20151104 via the command line. The upgrade itself succeeds, but afterwards HE-VM does not start up automatically.

Version-Release number of selected component (if applicable):
rhev-hypervisor7-7.1-20151015.0.iso
ovirt-node-3.2.3-23.el7.noarch
rhev-hypervisor7-7.2-20151104.0.iso
ovirt-node-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
rhevm-appliance-20151014.1-1.x86_64.rhevm.ova

How reproducible:
100%

Steps to Reproduce:
1. PXE install rhev-hypervisor7-7.1-20151015.0.iso
2. Install and configure HE-VM and confirm that it is running.
3. Upgrade to RHEV-H7.2-20151104 by passing the "upgrade" parameter on the command line:
initrd=/images/rhevh-vdsm7-7.2-20151104.0_36/initrd0.img ksdevice=bootif rootflags=loop rootflags=ro rd.dm=0 rd_NO_MULTIPATH rd.md=0 crashkernel=256M rootfstype=auto lang=  max_loop=256 rhgb quiet elevator=deadline rd.live.check rd.luks=0 install ro root=live:/rhev-hypervisor7-7.2-20151104.0.iso rd.live.image  BOOTIF=01-5c-f3-fc-e9-c0-c8 upgrade BOOT_IMAGE=/images/rhevh-vdsm7-7.2-20151104.0_36/vmlinuz0

Actual results:
After step 3, HE-VM fails to start up.

Expected results:
After step 3, HE-VM should start up and run successfully.

Additional info:
Errors can be found in /var/log/ovirt-hosted-engine-ha/agent.log:
[root@ibm-x3650m3-02 ovirt-hosted-engine-ha]# cat agent.log 
... ...
MainThread::INFO::2015-11-11 10:06:24,862::upgrade::165::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_is_conf_volume_there) Found conf volume: imgUUID:2e528651-6262-49bc-abd5-06d036e109c4, volUUID:f0be421a-9466-4cd9-b877-35387f33af70
MainThread::INFO::2015-11-11 10:06:24,992::upgrade::670::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_startMonitoringDomain) Start monitoring domain
MainThread::INFO::2015-11-11 10:06:55,568::upgrade::288::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_get_conffile_content) Reading conf file: hosted-engine.conf
MainThread::ERROR::2015-11-11 10:06:55,639::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Error executing: 1 - stdout: - stderr:mv: cannot backup ‘/etc/ovirt-hosted-engine/hosted-engine.conf’: Device or resource busy
' - trying to restart agent
MainThread::WARNING::2015-11-11 10:07:00,645::agent::208::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt '9'
MainThread::ERROR::2015-11-11 10:07:00,645::agent::210::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Too many errors occurred, giving up. Please review the log and consider filing a bug.
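
The agent.log excerpt above can be filtered mechanically. The following is a minimal Python sketch, assuming the field layout inferred from those lines (thread, level, timestamp, module, line number, logger, function, message). The regular expression and the names parse_agent_log and error_records are illustrative only and are not part of ovirt-hosted-engine-ha.

```python
import re

# Field layout inferred from the excerpt in this report, not from the agent source.
LINE_RE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::"
    r"(?P<logger>[\w.]+)::\((?P<func>\w+)\) (?P<message>.*)$"
)

# Two records copied from the excerpt; the second one spans two physical lines.
SAMPLE = """\
MainThread::INFO::2015-11-11 10:06:24,992::upgrade::670::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_startMonitoringDomain) Start monitoring domain
MainThread::ERROR::2015-11-11 10:06:55,639::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Error executing: 1 - stdout: - stderr:mv: cannot backup ‘/etc/ovirt-hosted-engine/hosted-engine.conf’: Device or resource busy
' - trying to restart agent
"""


def parse_agent_log(text):
    """Return one dict per log record.

    Lines that do not match the record pattern are treated as continuations
    and appended to the previous record's message.
    """
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            records.append(match.groupdict())
        elif records:
            records[-1]["message"] += "\n" + line
    return records


def error_records(records):
    """Keep only the records logged at ERROR level."""
    return [r for r in records if r["level"] == "ERROR"]


if __name__ == "__main__":
    for record in error_records(parse_agent_log(SAMPLE)):
        print(record["timestamp"], record["func"], record["message"])
```

Treating unmatched lines as continuations keeps the multi-line "mv: cannot backup" message in one record instead of dropping its tail.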

[root@ibm-x3650m3-02 mnt]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : False
Hostname                           : ibm-x3650m3-02.qe.lab.eng.nay.redhat.com
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 564
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=564 (Wed Nov 11 09:34:41 2015)
	host-id=1
	score=2400
	maintenance=False
	state=EngineStarting
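
The status output above reports "Status up-to-date: False" and "unknown stale-data", which is what one would check for when scripting this reproduction. A rough Python sketch follows, assuming the single-host text layout shown here; parse_vm_status and looks_stale are hypothetical helper names, not hosted-engine APIs, and the sample text is copied from this report.

```python
# Sample copied from the `hosted-engine --vm-status` output in this report.
SAMPLE_STATUS = """
--== Host 1 status ==--

Status up-to-date                  : False
Hostname                           : ibm-x3650m3-02.qe.lab.eng.nay.redhat.com
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 564
Extra metadata (valid at timestamp):
\tmetadata_parse_version=1
\tmetadata_feature_version=1
\ttimestamp=564 (Wed Nov 11 09:34:41 2015)
\thost-id=1
\tscore=2400
\tmaintenance=False
\tstate=EngineStarting
"""


def parse_vm_status(output):
    """Split the output into 'Key : value' fields and key=value metadata.

    Assumes a single host section, as in the sample; several hosts would
    overwrite each other's keys.
    """
    status, metadata = {}, {}
    for raw in output.splitlines():
        line = raw.strip()
        if not line or line.startswith("--=="):
            continue  # blank line or section banner
        if " : " in line:
            key, _, value = line.partition(" : ")
            status[key.strip()] = value.strip()
        elif "=" in line:
            key, _, value = line.partition("=")
            metadata[key.strip()] = value.strip()
    return status, metadata


def looks_stale(status):
    """True when the host reports outdated data or a stale engine status."""
    return (
        status.get("Status up-to-date") == "False"
        or "stale-data" in status.get("Engine status", "")
    )


if __name__ == "__main__":
    fields, extra = parse_vm_status(SAMPLE_STATUS)
    print(looks_stale(fields), extra.get("state"))
```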

Comment 1 Fabian Deutsch 2015-11-11 13:45:15 UTC
Error: 'Error executing: 1 - stdout: - stderr:mv: cannot backup ‘/etc/ovirt-hosted-engine/hosted-engine.conf’: Device or resource busy

Seems to be around persistence, but maybe that's just one issue preventing the start.

Comment 2 Douglas Schilling Landgraf 2015-11-12 16:11:08 UTC
To me this one is another symptom of bz#1280268. Could you please also share the ovirt-node.log?

Thanks!

Comment 3 Chaofeng Wu 2015-11-16 10:42:23 UTC
Created attachment 1094846 [details]
/var/log and sosreport

Comment 6 Ying Cui 2015-11-16 13:58:48 UTC
Douglas, for this bug we need a release note for RHEV-H 7.2 for 3.6 beta 1.

Comment 7 Doron Fediuck 2015-11-17 11:52:23 UTC
The key issue here is:
10:06:55,639::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Error executing: 1 - stdout: - stderr:mv: cannot backup ‘/etc/ovirt-hosted-engine/hosted-engine.conf’: Device or resource busy

The upgrade process from 3.5 to 3.6 is upgrading the configuration and moving
some parts to the shared storage. So far it was tested on RHEL and we need to ensure this is properly supported in RHEV-H as well.

This is not the same as bug 1280268 since this flow is related to the upgrade
procedure from 3.5 to 3.6.
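
To illustrate the failure mode: rename(2) returns EBUSY when the source or target is a mount point, which would explain the "mv: cannot backup ... Device or resource busy" message if the persisted hosted-engine.conf is a bind mount on RHEV-H (an assumption; this report only says the issue is "around persistence"). The Python sketch below shows one generic way to tolerate that case by rewriting the file in place. It is not the fix shipped in ovirt-hosted-engine-ha, and replace_file_content and its replace parameter are hypothetical names.

```python
import errno
import os
import tempfile


def replace_file_content(path, content, replace=os.replace):
    """Replace the content of `path`, tolerating EBUSY from the rename.

    `replace` is injectable only so that the EBUSY branch can be exercised
    without a real bind mount. Returns "renamed" or "rewritten-in-place".
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
        try:
            # Preferred path: atomic rename over the old file.
            replace(tmp_path, path)
            return "renamed"
        except OSError as exc:
            if exc.errno != errno.EBUSY:
                raise
        # The target cannot be renamed over (for example, it is a mount
        # point), so overwrite the existing file instead. Not atomic.
        with open(path, "w") as target:
            target.write(content)
        return "rewritten-in-place"
    finally:
        # The temporary file is gone after a successful rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    conf = os.path.join(workdir, "hosted-engine.conf")
    with open(conf, "w") as handle:
        handle.write("old\n")
    print(replace_file_content(conf, "new\n"))
```

The in-place write gives up atomicity, so a crash mid-write could leave a truncated file; that trade-off is the reason the rename is attempted first.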

Comment 8 Fabian Deutsch 2015-11-23 09:59:53 UTC
Moved it over to ovirt-hosted-engine-ha because the code fix around persistence is needed there.

Comment 11 Fabian Deutsch 2015-12-14 14:39:46 UTC
Please fill in the fixed in version field

Comment 12 Artyom 2015-12-22 15:53:30 UTC
Verified
Upgrade from 3.5 - Red Hat Enterprise Virtualization Hypervisor release 7.1 (20151015.0.el7ev)
==============================================================================================
ovirt-hosted-engine-ha-1.2.7.2-1.el7ev.noarch
ovirt-hosted-engine-setup-1.2.6.1-1.el7ev.noarch
vdsm-4.16.27-1.el7ev.x86_64

to 3.6 - Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151221.1.el7ev)
==============================================================================================
ovirt-hosted-engine-setup-1.3.1.3-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch
vdsm-4.17.13-1.el7ev.noarch

1) Install RHEV-H 3.5
2) Deploy hosted-engine on one host with NFS storage
3) Enable global maintenance (on the host: hosted-engine --set-maintenance --mode=global)
4) Upgrade engine to 3.6
5) Power off the engine VM (hosted-engine --vm-poweroff)
6) Upgrade host to RHEV-H 3.6 via USB key

The upgrade succeeded, and after the host upgrade ovirt-ha-agent and ovirt-ha-broker are up.
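
The pre-upgrade part of the verification steps above (steps 3 to 6) can be written down as a small plan. This is only a sketch in Python: the two hosted-engine commands are the ones quoted in the steps, while STEPS, run_plan, and the dry-run default are illustrative assumptions. The engine and host upgrades remain manual steps.

```python
import subprocess

# (description, argv) pairs; argv is None for steps that are done by hand.
STEPS = [
    ("Enable global maintenance",
     ["hosted-engine", "--set-maintenance", "--mode=global"]),
    ("Upgrade engine to 3.6", None),
    ("Power off engine VM", ["hosted-engine", "--vm-poweroff"]),
    ("Upgrade host to RHEV-H 3.6", None),
]


def run_plan(steps, dry_run=True, confirm=input):
    """Walk through the steps in order and return a log of what was done.

    With dry_run=True (the default) nothing is executed. Otherwise commands
    are run with subprocess and manual steps wait for confirmation.
    """
    log = []
    for description, argv in steps:
        if argv is None:
            log.append("manual: " + description)
            if not dry_run:
                confirm("Complete '%s' by hand, then press Enter: " % description)
        else:
            log.append("command: " + " ".join(argv))
            if not dry_run:
                subprocess.run(argv, check=True)
    return log


if __name__ == "__main__":
    for entry in run_plan(STEPS):
        print(entry)
```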

Comment 14 errata-xmlrpc 2016-03-09 19:50:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0422.html