Bug 1233825 - VM cannot start after adding a virtio-rng random number device (oVirt 3.5.3)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: General
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Shmuel Melamud
QA Contact: Nisim Simsolo
URL:
Whiteboard: virt
Depends On:
Blocks:
Reported: 2015-06-19 14:05 UTC by Patrick Hurrelmann
Modified: 2016-02-10 19:24 UTC (History)

Fixed In Version: 3.6.0-4_alpha3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-22 13:28:59 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-3.6.0+
ylavi: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+


Attachments
VDSM log of failed start (241.37 KB, text/plain)
2015-06-19 14:05 UTC, Patrick Hurrelmann
backport of gerrit changes 38982 and 40095 (1.80 KB, patch)
2015-06-22 10:28 UTC, Patrick Hurrelmann
VDSM log of failed start with patch applied (58.16 KB, text/plain)
2015-06-22 10:29 UTC, Patrick Hurrelmann


Links
System        ID     Private  Priority  Status     Summary                                          Last Updated
oVirt gerrit  42881  0        master    ABANDONED  virt: Safe device type check for console device  Never
oVirt gerrit  43165  0        master    MERGED     core: Send 'device' property for RNG device      Never
oVirt gerrit  43166  0        master    MERGED     api: 'device' property for RNG device            Never

Description Patrick Hurrelmann 2015-06-19 14:05:33 UTC
Created attachment 1040971 [details]
VDSM log of failed start

Description of problem:
When a random number generator (virtio-rng) is added to an existing VM, the VM no longer starts. The engine web UI shows:

2015-Jun-19, 15:39   Failed to run VM example-vm on Host vhost.
2015-Jun-19, 15:39   VM example-vm is down with error. Exit message: 'device'.
2015-Jun-19, 15:39   VM example-vm was started by user@FreeIPA (Host: vhost).

Version-Release number of selected component (if applicable):
On engine:
otopi.noarch                                         1.3.2-1.el7.centos
otopi-java.noarch                                    1.3.2-1.el7.centos
ovirt-engine.noarch                                  3.5.3.1-1.el7.centos
ovirt-engine-backend.noarch                          3.5.3.1-1.el7.centos
ovirt-engine-cli.noarch                              3.5.0.6-0.1.20141107.gitcf7f1a1.el7.centos
ovirt-engine-dbscripts.noarch                        3.5.3.1-1.el7.centos
ovirt-engine-extension-aaa-ldap.noarch               1.0.2-1.el7
ovirt-engine-extension-aaa-misc.noarch               1.0.0-1.el7
ovirt-engine-extensions-api-impl.noarch              3.5.3.1-1.el7.centos
ovirt-engine-jboss-as.x86_64                         7.1.1-1.el7
ovirt-engine-lib.noarch                              3.5.3.1-1.el7.centos
ovirt-engine-restapi.noarch                          3.5.3.1-1.el7.centos
ovirt-engine-sdk-python.noarch                       3.5.2.1-1.el7.centos
ovirt-engine-setup.noarch                            3.5.3.1-1.el7.centos
ovirt-engine-setup-base.noarch                       3.5.3.1-1.el7.centos
ovirt-engine-setup-plugin-ovirt-engine.noarch        3.5.3.1-1.el7.centos
ovirt-engine-setup-plugin-ovirt-engine-common.noarch 3.5.3.1-1.el7.centos
ovirt-engine-setup-plugin-websocket-proxy.noarch     3.5.3.1-1.el7.centos
ovirt-engine-tools.noarch                            3.5.3.1-1.el7.centos
ovirt-engine-userportal.noarch                       3.5.3.1-1.el7.centos
ovirt-engine-webadmin-portal.noarch                  3.5.3.1-1.el7.centos
ovirt-engine-websocket-proxy.noarch                  3.5.3.1-1.el7.centos
ovirt-guest-agent-common.noarch                      1.0.10-2.el7
ovirt-host-deploy.noarch                             1.3.1-1.el7
ovirt-host-deploy-java.noarch                        1.3.1-1.el7
ovirt-image-uploader.noarch                          3.5.1-1.el7.centos
ovirt-iso-uploader.noarch                            3.5.2-1.el7.centos

On host:
vdsm.x86_64                        4.16.20-0.el7.centos
vdsm-cli.noarch                    4.16.20-0.el7.centos
vdsm-jsonrpc.noarch                4.16.20-0.el7.centos
vdsm-python.noarch                 4.16.20-0.el7.centos
vdsm-python-zombiereaper.noarch    4.16.20-0.el7.centos
vdsm-xmlrpc.noarch                 4.16.20-0.el7.centos
vdsm-yajsonrpc.noarch              4.16.20-0.el7.centos

How reproducible:
Every time

Steps to Reproduce:
1. Enable Random Number Generator on cluster level
2. Enable Random Number Generator on VM
3. Start VM

Actual results:
The VM does not start. It fails with the error "Exit message: 'device'".

Expected results:
VM is started and has a virtio-rng device attached.


Additional info:
The log file from the VM startup is attached (vdsm.log).

According to a discussion in #vdsm, the problem is that the device specification is missing the 'device' attribute, which VDSM depends on. VDSM checks for this attribute in quite a few places, so working around the problem in VDSM is hard and counterproductive.
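
The bare exit message 'device' is consistent with a Python KeyError raised when a device dictionary is indexed by a key it does not contain. A minimal sketch of that behaviour, assuming a simplified device specification; the dictionary layout and the names device_kind and start_exit_message are hypothetical and are not taken from the VDSM source:

```python
# Hypothetical, simplified device specifications (not actual VDSM data).
console_spec = {"type": "console", "device": "console"}
rng_spec = {"type": "rng", "model": "virtio"}  # note: no 'device' key


def device_kind(spec):
    # Direct indexing raises KeyError when 'device' is absent.
    return spec["device"]


def start_exit_message(specs):
    """Return the text a caller would report if it used str(exception)."""
    try:
        for spec in specs:
            device_kind(spec)
    except KeyError as exc:
        # str() of a KeyError is the repr of the missing key, quotes included.
        return str(exc)
    return ""


if __name__ == "__main__":
    print(start_exit_message([console_spec, rng_spec]))
```

With the RNG entry lacking the key, the reported text is just the quoted key name, which matches the form of the message shown in the engine events.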

Comment 1 Omer Frenkel 2015-06-21 08:06:58 UTC
Hi Patrick,
could you check whether
https://gerrit.ovirt.org/#/c/40095/
solves the issue?
Thanks!

Comment 2 Shmuel Melamud 2015-06-21 12:19:03 UTC
This is exactly the issue that I've found and fixed with change https://gerrit.ovirt.org/#/c/40095/. vdsm 4.16.20 doesn't include this change.

I cannot reproduce the issue with the latest build of vdsm from master. Please try it with your setup.

Comment 3 Omer Frenkel 2015-06-21 14:14:57 UTC
Thanks Shmuel,
let's try to fix it for the next release.

Comment 4 Patrick Hurrelmann 2015-06-22 10:27:37 UTC
I already tried that before opening this bug.
The mentioned gerrit change https://gerrit.ovirt.org/#/c/40095/ seems to need at least https://gerrit.ovirt.org/#/c/38982/ as well. Neither applies cleanly to vdsm 4.16.20; more changes are needed. An attempt to backport both changesets to 4.16.20 is attached (vdsm-virt-rng-device.diff), but it does not change anything: the error is still the same. The VDSM log with the patch applied is also attached to this bz.

Comment 5 Patrick Hurrelmann 2015-06-22 10:28:33 UTC
Created attachment 1041709 [details]
backport of gerrit changes 38982 and 40095

Comment 6 Patrick Hurrelmann 2015-06-22 10:29:06 UTC
Created attachment 1041710 [details]
VDSM log of failed start with patch applied

Comment 7 Patrick Hurrelmann 2015-06-22 12:00:55 UTC
After a discussion on IRC, this was partly solved. Starting a VM with virtio-rng attached works when virtio-console is detached. VM start still fails with the same error message when virtio-console is attached. So it's the combination of virtio-console and virtio-rng that renders the VM unusable. Both virtio devices are usable when added separately.

Comment 8 Omer Frenkel 2015-06-25 12:39:08 UTC
It seems the issue with both rng and virtio-console happens in 3.6 as well; re-targeting.

Comment 9 Nisim Simsolo 2015-12-13 14:20:05 UTC
Verified.
rhevm-3.6.1.2-0.1.el6 
libvirt-client-1.2.17-13.el7_2.2.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.4.x86_64
vdsm-4.17.13-1.el7ev.noarch
sanlock-3.2.4-1.el7.x86_64

Verification scenario:
1. Enable cluster rng
2. Enable rng on VM and attach virtio-console.
3. Run VM, verify rng is working.
4. Power off VM, detach virtio-console.
5. Run VM.
6. Verify VM is running. Verify rng is working.
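
The scenario does not say how "rng is working" was checked. One possible guest-side check, sketched under the assumption of a Linux guest with the hw_random core exposed in sysfs, is to read the kernel's current hardware RNG source and confirm it is the virtio-rng driver. The function name virtio_rng_active is hypothetical, and this is not necessarily the check used during verification:

```python
from pathlib import Path

# Linux sysfs file naming the hw_random core's currently selected source.
RNG_CURRENT = Path("/sys/class/misc/hw_random/rng_current")


def virtio_rng_active(rng_current=RNG_CURRENT):
    """Return True if the guest's current hwrng source is a virtio-rng device."""
    try:
        name = Path(rng_current).read_text().strip()
    except OSError:
        # File missing or unreadable: no usable hw_random source visible.
        return False
    # The virtio-rng driver registers sources named like "virtio_rng.0".
    return name.startswith("virtio_rng")


if __name__ == "__main__":
    print("virtio-rng active:", virtio_rng_active())
```

Run inside the guest after steps 3 and 5; the path argument exists only so the function can be pointed at a different file for testing.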

Comment 10 Sandro Bonazzola 2015-12-22 13:28:59 UTC
oVirt 3.6.0 has been released and the bz verified, moving to closed current release.

