Bug 1207155 - Failed to start VM after upgrade from 6.5 to 6.6
Summary: Failed to start VM after upgrade from 6.5 to 6.6
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node
Version: 3.5.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Ryan Barry
QA Contact: cshao
URL:
Whiteboard:
Depends On:
Blocks: 1193058 1211054
 
Reported: 2015-03-30 10:44 UTC by cshao
Modified: 2016-03-09 14:20 UTC
CC List: 14 users

Fixed In Version: ovirt-node-3.3.0-0.4.20150906git14a6024.el7ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1211054 (view as bug list)
Environment:
Last Closed: 2016-03-09 14:20:15 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
vm-failed.tar.gz (591.49 KB, application/x-gzip)
2015-03-30 10:44 UTC, cshao


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1198187 0 urgent CLOSED vdsm log is flooded with cgroup CPUACCT controller is not mounted errors, migration of VMs not possible 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHBA-2016:0378 0 normal SHIPPED_LIVE ovirt-node bug fix and enhancement update for RHEV 3.6 2016-03-09 19:06:36 UTC
oVirt gerrit 39376 0 master MERGED Allow ovirt_t to transition to unconfined_t for ovirt-post Never
oVirt gerrit 39478 0 master MERGED init: Run handlers as unconfied_t Never
oVirt gerrit 39479 0 ovirt-3.5 MERGED init: Run handlers as unconfied_t Never
oVirt gerrit 39487 0 ovirt-3.5 MERGED Allow ovirt_t to transition to unconfined_t for ovirt-post Never

Internal Links: 1198187

Description cshao 2015-03-30 10:44:21 UTC
Created attachment 1008346 [details]
vm-failed.tar.gz

Description of problem:
Failed to start VM after upgrade from 6.5 to 6.6

"VM cshao_vm1 is down with error. Exit message: internal error Cannot parse sensitivity level in s0"

Version-Release number of selected component (if applicable):
RHEVH 6.5 20150115
rhev-hypervisor6-6.6-20150327.0
ovirt-node-3.2.2-1.el6.noarch
RHEV-M VT14.1 (3.5.1-0.2.el6ev)

How reproducible:
100%

Steps to Reproduce:
1. Install RHEVH 6.5 20150115 with a DHCP network.
2. Add the rhevh host via the rhevm portal into a 3.4 compatibility version cluster in rhevm 3.5.1-0.2.el6ev.
3. Create one VM and run it on the host successfully.
4. Shut down the VM, then put the rhevh 6.5 host into maintenance.
5. Upgrade the rhevh 6.5 host via the rhevm portal to rhev-hypervisor6-6.6-20150327.0.
6. Start the VM.

Actual results:
The VM fails to start after the upgrade from 6.5 to 6.6.

Expected results:
The VM starts successfully after the upgrade from 6.5 to 6.6.

Additional info:

Comment 1 cshao 2015-03-30 10:47:30 UTC
This should be considered a regression, since there was no such issue when upgrading from rhevh-6.5-20150115 to rhevh-6.6-0128.

Comment 2 Fabian Deutsch 2015-03-30 11:32:19 UTC
I've seen the same error when libvirtd was started with the wrong SELinux context. This is probably happening because libvirtd might get indirectly restarted by vdsm-tool in one of the on-boot hooks.

If this is the cause, I see two possible solutions:
1. If vdsm is restarting libvirt: stop libvirt in that hook and let vdsm restart it; it should then get the right context.
2. Allow the transition from ovirt_t to virtd_t (a sketch of how this could be prototyped follows below).

The general problem I see is that we might have more (invisible) incorrect SELinux contexts, but this bug is currently the only visible one.
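To illustrate option 2, a local policy module could be prototyped from the logged AVC denials with the standard SELinux tooling. This is only a sketch, assuming the denials appear in the audit log; the module name is invented here, and the merged gerrit patches linked above ultimately took a different route (transitioning to unconfined_t).

~~~
# Sketch only: build a local policy module from the AVC denials logged
# since boot and load it. "ovirt_libvirt_local" is a made-up name.
ausearch -m avc -ts boot | grep libvirtd | audit2allow -M ovirt_libvirt_local
semodule -i ovirt_libvirt_local.pp
~~~

Note that this only codifies whatever denials happen to be logged; stopping libvirt in the hook (option 1) would avoid custom policy entirely.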

Comment 4 Fabian Deutsch 2015-03-30 13:34:58 UTC
An idea would be to run the ovirt-post service in the initd_t context; then all transition rules for initd_t would apply as well and we would not need custom ones.
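For context, the transitions that an init-run hook would pick up can be inspected with the standard policy tools. This is only a sketch: the script path below is an assumption for illustration, and initrc_t is the usual name of the init-script domain in the shipped targeted policy.

~~~
# Sketch only: /etc/init.d/ovirt-post is an assumed path.
matchpathcon /etc/init.d/ovirt-post                 # file context of the hook script
sesearch -T -s initrc_t -c process | grep -i virt   # domain transitions applied to init-run scripts
~~~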

Comment 10 Marina Kalinin 2015-04-02 23:21:06 UTC
Hi,

I have a customer failing to migrate VMs to such a host (upgraded to - 20150128.0.el6ev) with only this error in vdsm.log:
~~~
VM Channels Listener::DEBUG::2015-03-10 12:02:01,568::guestagent::217::vm.Vm::(_connect) vmId=`8f9b6fe7-0aae-4816-9669-8abc4e20a8e9`::Connection attempt failed: [Errno 9] Bad file descriptor
VM Channels Listener::DEBUG::2015-03-10 12:02:34,194::guestagent::201::vm.Vm::(_connect) vmId=`8f9b6fe7-0aae-4816-9669-8abc4e20a8e9`::Attempting connection to /var/lib/libvirt/qemu/channels/8f9b6fe7-0aae-4816-9669-8abc4e20a8e9.com.redhat.rhevm.vdsm
~~~

Can this be related to this bug, or is it a new problem?
What should appear in vdsm.log to identify the problem described in this bug?

Comment 11 Ryan Barry 2015-04-02 23:39:15 UTC
(In reply to Marina from comment #10)
> Hi,
> 
> I have a customer failing to migrate VMs to such a host (upgraded to -
> 20150128.0.el6ev) with this only error in vdsm.log:
> ~~~
> VM Channels Listener::DEBUG::2015-03-10
> 12:02:01,568::guestagent::217::vm.Vm::(_connect)
> vmId=`8f9b6fe7-0aae-4816-9669-8abc4e20a8e9`::Connection attempt failed:
> [Errno 9] Bad file descriptor
> VM Channels Listener::DEBUG::2015-03-10
> 12:02:34,194::guestagent::201::vm.Vm::(_connect)
> vmId=`8f9b6fe7-0aae-4816-9669-8abc4e20a8e9`::Attempting connection to
> /var/lib/libvirt/qemu/channels/8f9b6fe7-0aae-4816-9669-8abc4e20a8e9.com.
> redhat.rhevm.vdsm
> ~~~
> 
> Can this be related to this bug or it is a new problem?
> What is expected in vdsm.log to identify the problem described in this bug?

That looks like a different bug entirely. This bug can be identified by checking the SELinux context of libvirtd, which should be virtd_t but was improperly set to ovirt_t.

If libvirtd is running as virtd_t, please file a new bug against vdsm.

Comment 12 cshao 2015-04-03 05:43:14 UTC
Test version:
RHEVH 6.5 20150115
rhev-hypervisor6-6.6-20150402.0
ovirt-node-3.2.2-3.el6.noarch
RHEV-M VT14.2 (3.5.1-0.3.el6ev)

Test steps:
1. Install RHEVH 6.5 20150115 with a DHCP network.
2. Add the rhevh host via the rhevm portal into a 3.4 compatibility version cluster in rhevm.
3. Create one VM and run it on the host successfully.
4. Shut down the VM, then put the rhevh 6.5 host into maintenance.
5. Upgrade the rhevh 6.5 host via the rhevm portal to rhev-hypervisor6-6.6-20150402.0.
6. Start the VM.

Actual results:
The VM starts successfully after the upgrade from 6.5 to 6.6.

So the bug is fixed by the above build. I will verify this bug after the status changes to ON_QA.

Thanks!

Comment 13 Marina Kalinin 2015-04-03 15:47:35 UTC
>    This bug can be identified by
> checking the selinux context of libvirtd, which should be virtd_t, but was
> improperly set as ovirt_d.
> 
> If libvirtd is running as virtd_t, please file a new bug against vdsm.
Thank you, Ryan.
I think you meant if libvirtd is NOT running as virtd_t.
And the way to confirm it is:

# ps -eZ |grep libvirtd
system_u:system_r:virtd_t:s0-s0:c0.c1023 30858 ? 00:03:56 libvirtd
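For contrast, on a host hit by this bug the same check shows libvirtd stuck in the hook's domain; the output below is illustrative only, not taken from the attached logs:

~~~
# ps -eZ | grep libvirtd
system_u:system_r:ovirt_t:s0    12345 ?        00:00:03 libvirtd
~~~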

Comment 14 Ying Cui 2015-04-07 13:50:46 UTC
This bug is targeted at rhev 3.6.0, not rhev 3.5.1, so I am moving the status back to MODIFIED. Let's wait for the zstream clone first, then move it to ON_QA. Thanks.

Comment 19 cshao 2015-10-28 09:01:01 UTC
This bug is blocked by bug 1275956; I will verify it after bug 1275956 is fixed.

Comment 20 cshao 2015-11-23 08:11:07 UTC
Test version:
rhev-hypervisor7-7.1-20151015.0
rhev-hypervisor7-7.2-20151112.1
ovirt-node-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
RHEV-M vt 18.2 (3.5.6.2-0.1.el6ev) 

Test steps:
1. Install RHEVH 7-7.1-20151015.0.
2. Add the rhevh host via the rhevm portal into a 3.5 compatibility version cluster in rhevm 3.5.
3. Create one VM and run it on the host successfully.
4. Shut down the VM, then put the rhevh 7.1 host into maintenance.
5. Upgrade the rhevh 7.1 host via the rhevm portal to rhev-hypervisor7-7.2-20151112.1.
6. Start the VM.

Test result:
The VM starts successfully after the upgrade from 7.1 to 7.2 via RHEV-M.

So the bug is fixed; changing the bug status to VERIFIED.

Comment 22 errata-xmlrpc 2016-03-09 14:20:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0378.html

