Bug 1207155 - Failed to start VM after upgrade from 6.5 to 6.6
Summary: Failed to start VM after upgrade from 6.5 to 6.6
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node
Version: 3.5.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Ryan Barry
QA Contact: cshao
URL:
Whiteboard:
Depends On:
Blocks: 1193058 1211054
 
Reported: 2015-03-30 10:44 UTC by cshao
Modified: 2016-03-09 14:20 UTC
CC List: 14 users

Fixed In Version: ovirt-node-3.3.0-0.4.20150906git14a6024.el7ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1211054 (view as bug list)
Environment:
Last Closed: 2016-03-09 14:20:15 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
vm-failed.tar.gz (591.49 KB, application/x-gzip)
2015-03-30 10:44 UTC, cshao


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1198187 0 urgent CLOSED vdsm log is flooded with cgroup CPUACCT controller is not mounted errors, migration of VMs not possible 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHBA-2016:0378 0 normal SHIPPED_LIVE ovirt-node bug fix and enhancement update for RHEV 3.6 2016-03-09 19:06:36 UTC
oVirt gerrit 39376 0 master MERGED Allow ovirt_t to transition to unconfined_t for ovirt-post Never
oVirt gerrit 39478 0 master MERGED init: Run handlers as unconfied_t Never
oVirt gerrit 39479 0 ovirt-3.5 MERGED init: Run handlers as unconfied_t Never
oVirt gerrit 39487 0 ovirt-3.5 MERGED Allow ovirt_t to transition to unconfined_t for ovirt-post Never

Internal Links: 1198187

Description cshao 2015-03-30 10:44:21 UTC
Created attachment 1008346 [details]
vm-failed.tar.gz

Description of problem:
Failed to start VM after upgrade from 6.5 to 6.6

"VM cshao_vm1 is down with error. Exit message: internal error Cannot parse sensitivity level in s0"

Version-Release number of selected component (if applicable):
RHEVH 6.5 20150115
rhev-hypervisor6-6.6-20150327.0
ovirt-node-3.2.2-1.el6.noarch
RHEV-M VT14.1 (3.5.1-0.2.el6ev)

How reproducible:
100%

Steps to Reproduce:
1. Install RHEVH 6.5 20150115 with a DHCP network.
2. Add the rhevh host via the rhevm portal into a 3.4 compatibility version cluster in rhevm 3.5.1-0.2.el6ev.
3. Create one VM and run it on the host successfully.
4. Shut down the VM, then put the rhevh 6.5 host into maintenance.
5. Upgrade the rhevh 6.5 host via the rhevm portal to rhev-hypervisor6-6.6-20150327.0.
6. Start the VM.

Actual results:
The VM fails to start after the upgrade from 6.5 to 6.6.

Expected results:
The VM starts successfully after the upgrade from 6.5 to 6.6.

Additional info:

Comment 1 cshao 2015-03-30 10:47:30 UTC
This should be considered a regression, since there was no such issue when upgrading from rhevh-6.5-20150115 to rhevh-6.6-0128.

Comment 2 Fabian Deutsch 2015-03-30 11:32:19 UTC
I've seen the same error when libvirtd was started with the wrong SELinux context. This is probably happening because libvirtd might get indirectly restarted by vdsm-tool in one of the on-boot hooks.

If this is the cause, I see two possible solutions:
1. If vdsm is restarting libvirt: stop libvirt in that hook and let vdsm restart it; it should then get the right context.
2. Allow the transition from ovirt_t to virtd_t (a sketch of how this could be prototyped follows below).

The general problem I see is that we might have more (invisible) incorrect SELinux contexts, but this bug is currently the only visible one.
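To illustrate option 2, a local policy module could be prototyped from the logged AVC denials with the standard SELinux tooling. This is only a sketch, assuming the denials appear in the audit log; the module name is invented here, and the merged gerrit patches linked above ultimately took a different route (transitioning to unconfined_t).

~~~
# Sketch only: build a local policy module from the AVC denials logged
# since boot and load it. "ovirt_libvirt_local" is a made-up name.
ausearch -m avc -ts boot | grep libvirtd | audit2allow -M ovirt_libvirt_local
semodule -i ovirt_libvirt_local.pp
~~~

Note that this only codifies whatever denials happen to be logged; stopping libvirt in the hook (option 1) would avoid custom policy entirely.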

Comment 4 Fabian Deutsch 2015-03-30 13:34:58 UTC
An idea would be to run the ovirt-post service in the initd_t context; then all transition rules for initd_t would apply as well and we would not need custom ones.
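For context, the transitions that an init-run hook would pick up can be inspected with the standard policy tools. This is only a sketch: the script path below is an assumption for illustration, and initrc_t is the usual name of the init-script domain in the shipped targeted policy.

~~~
# Sketch only: /etc/init.d/ovirt-post is an assumed path.
matchpathcon /etc/init.d/ovirt-post                 # file context of the hook script
sesearch -T -s initrc_t -c process | grep -i virt   # domain transitions applied to init-run scripts
~~~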

Comment 10 Marina Kalinin 2015-04-02 23:21:06 UTC
Hi,

I have a customer failing to migrate VMs to such a host (upgraded to - 20150128.0.el6ev) with only this error in vdsm.log:
~~~
VM Channels Listener::DEBUG::2015-03-10 12:02:01,568::guestagent::217::vm.Vm::(_connect) vmId=`8f9b6fe7-0aae-4816-9669-8abc4e20a8e9`::Connection attempt failed: [Errno 9] Bad file descriptor
VM Channels Listener::DEBUG::2015-03-10 12:02:34,194::guestagent::201::vm.Vm::(_connect) vmId=`8f9b6fe7-0aae-4816-9669-8abc4e20a8e9`::Attempting connection to /var/lib/libvirt/qemu/channels/8f9b6fe7-0aae-4816-9669-8abc4e20a8e9.com.redhat.rhevm.vdsm
~~~

Can this be related to this bug, or is it a new problem?
What should appear in vdsm.log to identify the problem described in this bug?

Comment 11 Ryan Barry 2015-04-02 23:39:15 UTC
(In reply to Marina from comment #10)
> Hi,
> 
> I have a customer failing to migrate VMs to such a host (upgraded to -
> 20150128.0.el6ev) with this only error in vdsm.log:
> ~~~
> VM Channels Listener::DEBUG::2015-03-10
> 12:02:01,568::guestagent::217::vm.Vm::(_connect)
> vmId=`8f9b6fe7-0aae-4816-9669-8abc4e20a8e9`::Connection attempt failed:
> [Errno 9] Bad file descriptor
> VM Channels Listener::DEBUG::2015-03-10
> 12:02:34,194::guestagent::201::vm.Vm::(_connect)
> vmId=`8f9b6fe7-0aae-4816-9669-8abc4e20a8e9`::Attempting connection to
> /var/lib/libvirt/qemu/channels/8f9b6fe7-0aae-4816-9669-8abc4e20a8e9.com.
> redhat.rhevm.vdsm
> ~~~
> 
> Can this be related to this bug or it is a new problem?
> What is expected in vdsm.log to identify the problem described in this bug?

That looks like a different bug entirely. This bug can be identified by checking the SELinux context of libvirtd, which should be virtd_t but was improperly set to ovirt_t.

If libvirtd is running as virtd_t, please file a new bug against vdsm.

Comment 12 cshao 2015-04-03 05:43:14 UTC
Test version:
RHEVH 6.5 20150115
rhev-hypervisor6-6.6-20150402.0
ovirt-node-3.2.2-3.el6.noarch
RHEV-M VT14.2 (3.5.1-0.3.el6ev)

Test steps:
1. Install RHEVH 6.5 20150115 with a DHCP network.
2. Add the rhevh host via the rhevm portal into a 3.4 compatibility version cluster in rhevm.
3. Create one VM and run it on the host successfully.
4. Shut down the VM, then put the rhevh 6.5 host into maintenance.
5. Upgrade the rhevh 6.5 host via the rhevm portal to rhev-hypervisor6-6.6-20150402.0.
6. Start the VM.

Actual results:
The VM starts successfully after the upgrade from 6.5 to 6.6.

So the bug is fixed by the above build. I will verify this bug after the status changes to ON_QA.

Thanks!

Comment 13 Marina Kalinin 2015-04-03 15:47:35 UTC
>    This bug can be identified by
> checking the selinux context of libvirtd, which should be virtd_t, but was
> improperly set as ovirt_d.
> 
> If libvirtd is running as virtd_t, please file a new bug against vdsm.
Thank you, Ryan.
I think you meant if libvirtd is NOT running as virtd_t.
And the way to confirm it is:

# ps -eZ |grep libvirtd
system_u:system_r:virtd_t:s0-s0:c0.c1023 30858 ? 00:03:56 libvirtd
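For contrast, on a host hit by this bug the same check shows libvirtd stuck in the hook's domain; the output below is illustrative only, not taken from the attached logs:

~~~
# ps -eZ | grep libvirtd
system_u:system_r:ovirt_t:s0    12345 ?        00:00:03 libvirtd
~~~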

Comment 14 Ying Cui 2015-04-07 13:50:46 UTC
This bug is targeted at rhev 3.6.0, not rhev 3.5.1, so I am moving the status back to MODIFIED. Let's wait for the zstream clone first, then move it to ON_QA. Thanks.

Comment 19 cshao 2015-10-28 09:01:01 UTC
This bug is blocked by bug 1275956; I will verify it after bug 1275956 is fixed.

Comment 20 cshao 2015-11-23 08:11:07 UTC
Test version:
rhev-hypervisor7-7.1-20151015.0
rhev-hypervisor7-7.2-20151112.1
ovirt-node-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
RHEV-M vt 18.2 (3.5.6.2-0.1.el6ev) 

Test steps:
1. Install RHEVH 7-7.1-20151015.0.
2. Add the rhevh host via the rhevm portal into a 3.5 compatibility version cluster in rhevm 3.5.
3. Create one VM and run it on the host successfully.
4. Shut down the VM, then put the rhevh 7.1 host into maintenance.
5. Upgrade the rhevh 7.1 host via the rhevm portal to rhev-hypervisor7-7.2-20151112.1.
6. Start the VM.

Test result:
The VM starts successfully after the upgrade from 7.1 to 7.2 via RHEV-M.

So the bug is fixed; changing the bug status to VERIFIED.

Comment 22 errata-xmlrpc 2016-03-09 14:20:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0378.html

