Bug 1233825

Summary: VM cannot start after adding a virtio-rng random number device (oVirt 3.5.3)
Product: [oVirt] ovirt-engine
Reporter: Patrick Hurrelmann <redhat>
Component: General
Assignee: Shmuel Melamud <smelamud>
Status: CLOSED CURRENTRELEASE
QA Contact: Nisim Simsolo <nsimsolo>
Severity: high
Docs Contact:
Priority: high
Version: ---
CC: bugs, ecohen, fromani, gklein, lsurette, michal.skrivanek, nsimsolo, rbalakri, redhat, yeylon
Target Milestone: ovirt-3.6.0-rc
Flags: rule-engine: ovirt-3.6.0+, ylavi: planning_ack+, rule-engine: devel_ack+, rule-engine: testing_ack+
Target Release: 3.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: virt
Fixed In Version: 3.6.0-4_alpha3
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-22 13:28:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (description: flags):
VDSM log of failed start: none
backport of gerrit changes 38982 and 40095: none
VDSM log of failed start with patch applied: none

Description Patrick Hurrelmann 2015-06-19 14:05:33 UTC
Created attachment 1040971 [details]
VDSM log of failed start

Description of problem:
When a random number generator (virtio-rng) is added to an existing VM, the VM no longer starts. The Engine web UI shows:

2015-Jun-19, 15:39   Failed to run VM example-vm on Host vhost.
2015-Jun-19, 15:39   VM example-vm is down with error. Exit message: 'device'.
2015-Jun-19, 15:39   VM example-vm was started by user@FreeIPA (Host: vhost).

Version-Release number of selected component (if applicable):
On engine:
otopi.noarch                                         1.3.2-1.el7.centos
otopi-java.noarch                                    1.3.2-1.el7.centos
ovirt-engine.noarch                                  3.5.3.1-1.el7.centos
ovirt-engine-backend.noarch                          3.5.3.1-1.el7.centos
ovirt-engine-cli.noarch                              3.5.0.6-0.1.20141107.gitcf7f1a1.el7.centos
ovirt-engine-dbscripts.noarch                        3.5.3.1-1.el7.centos
ovirt-engine-extension-aaa-ldap.noarch               1.0.2-1.el7
ovirt-engine-extension-aaa-misc.noarch               1.0.0-1.el7
ovirt-engine-extensions-api-impl.noarch              3.5.3.1-1.el7.centos
ovirt-engine-jboss-as.x86_64                         7.1.1-1.el7
ovirt-engine-lib.noarch                              3.5.3.1-1.el7.centos
ovirt-engine-restapi.noarch                          3.5.3.1-1.el7.centos
ovirt-engine-sdk-python.noarch                       3.5.2.1-1.el7.centos
ovirt-engine-setup.noarch                            3.5.3.1-1.el7.centos
ovirt-engine-setup-base.noarch                       3.5.3.1-1.el7.centos
ovirt-engine-setup-plugin-ovirt-engine.noarch        3.5.3.1-1.el7.centos
ovirt-engine-setup-plugin-ovirt-engine-common.noarch 3.5.3.1-1.el7.centos
ovirt-engine-setup-plugin-websocket-proxy.noarch     3.5.3.1-1.el7.centos
ovirt-engine-tools.noarch                            3.5.3.1-1.el7.centos
ovirt-engine-userportal.noarch                       3.5.3.1-1.el7.centos
ovirt-engine-webadmin-portal.noarch                  3.5.3.1-1.el7.centos
ovirt-engine-websocket-proxy.noarch                  3.5.3.1-1.el7.centos
ovirt-guest-agent-common.noarch                      1.0.10-2.el7
ovirt-host-deploy.noarch                             1.3.1-1.el7
ovirt-host-deploy-java.noarch                        1.3.1-1.el7
ovirt-image-uploader.noarch                          3.5.1-1.el7.centos
ovirt-iso-uploader.noarch                            3.5.2-1.el7.centos

On host:
vdsm.x86_64                        4.16.20-0.el7.centos
vdsm-cli.noarch                    4.16.20-0.el7.centos
vdsm-jsonrpc.noarch                4.16.20-0.el7.centos
vdsm-python.noarch                 4.16.20-0.el7.centos
vdsm-python-zombiereaper.noarch    4.16.20-0.el7.centos
vdsm-xmlrpc.noarch                 4.16.20-0.el7.centos
vdsm-yajsonrpc.noarch              4.16.20-0.el7.centos

How reproducible:
Every time

Steps to Reproduce:
1. Enable Random Number Generator on cluster level
2. Enable Random Number Generator on VM
3. Start VM

Actual results:
VM does not start. It fails with the error "Exit message: 'device'".

Expected results:
VM is started and has a virtio-rng device attached.


Additional info:
The log file from the VM startup is attached (vdsm.log).

According to discussion in #vdsm, the problem is that the device specification lacks the 'device' attribute, and VDSM depends on it. VDSM checks for this attribute in quite a few places, so working around it in VDSM would be hard and counterproductive.
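
A minimal Python sketch of how a missing 'device' key could surface as the bare exit message 'device'. It assumes the failure is an unhandled KeyError raised on a dict lookup; that is an assumption, not something confirmed by the attached log. The dict layouts and the function names (device_names, describe_failure) are hypothetical and are not VDSM code.

```python
# Hypothetical device specifications as plain dicts. Every field name other
# than 'device' is an illustrative assumption, not taken from the VDSM log.
console_spec = {"type": "console", "device": "console"}
rng_spec = {"type": "rng", "model": "virtio"}  # note: no 'device' key


def device_names(specs):
    """Collect the 'device' value of each spec, as code relying on the key would."""
    return [spec["device"] for spec in specs]


def describe_failure(specs):
    """Return the exception text raised for a spec lacking 'device', else None."""
    try:
        device_names(specs)
    except KeyError as exc:
        # str() of a KeyError is the repr of the missing key, quotes included.
        return str(exc)
    return None


if __name__ == "__main__":
    print(describe_failure([console_spec, rng_spec]))
```

The text of a KeyError is just the quoted key name, which would explain why the message carries no further context.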

Comment 1 Omer Frenkel 2015-06-21 08:06:58 UTC
Hi Patrick,
could you check whether
https://gerrit.ovirt.org/#/c/40095/
solves the issue?
Thanks!

Comment 2 Shmuel Melamud 2015-06-21 12:19:03 UTC
This is exactly the issue that I've found and fixed with change https://gerrit.ovirt.org/#/c/40095/. vdsm 4.16.20 doesn't include this change.

I cannot reproduce the issue with the latest build of vdsm from master. Please try it with your setup.

Comment 3 Omer Frenkel 2015-06-21 14:14:57 UTC
Thanks Shmuel,
let's try to fix this for the next release.

Comment 4 Patrick Hurrelmann 2015-06-22 10:27:37 UTC
I already tried that before opening this bug.
The mentioned gerrit change https://gerrit.ovirt.org/#/c/40095/ seems to need at least https://gerrit.ovirt.org/#/c/38982/ as well. Neither applies cleanly to vdsm 4.16.20; more changes are needed. An attempt to backport both changesets to 4.16.20 is attached (vdsm-virt-rng-device.diff), but it does not change anything: the error is still the same. The VDSM log with the patch applied is also attached to this bz.

Comment 5 Patrick Hurrelmann 2015-06-22 10:28:33 UTC
Created attachment 1041709 [details]
backport of gerrit changes 38982 and 40095

Comment 6 Patrick Hurrelmann 2015-06-22 10:29:06 UTC
Created attachment 1041710 [details]
VDSM log of failed start with patch applied

Comment 7 Patrick Hurrelmann 2015-06-22 12:00:55 UTC
After discussion on IRC this was partly solved. Starting a VM with virtio-rng attached works when virtio-console is detached. VM start still fails with the same error message when virtio-console is attached. So it's the combination of virtio-console and virtio-rng that renders the VM unusable. Both virtio devices are usable when added separately.

Comment 8 Omer Frenkel 2015-06-25 12:39:08 UTC
It seems the issue with both rng and virtio-console happens in 3.6 as well; re-targeting.

Comment 9 Nisim Simsolo 2015-12-13 14:20:05 UTC
Verified.
rhevm-3.6.1.2-0.1.el6 
libvirt-client-1.2.17-13.el7_2.2.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.4.x86_64
vdsm-4.17.13-1.el7ev.noarch
sanlock-3.2.4-1.el7.x86_64

Verification scenario:
1. Enable cluster rng
2. Enable rng on VM and attach virtio-console.
3. Run VM, verify rng is working.
4. Power off VM, detach virtio-console.
5. Run VM.
6. Verify VM is running. Verify rng is working.
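
One possible way to do the "verify rng is working" steps from inside a Linux guest is sketched below. It is not the procedure used for this verification. It assumes the guest kernel exposes the hw_random sysfs directory and the /dev/hwrng character device; the function names are hypothetical.

```python
from pathlib import Path

# Standard Linux hw_random sysfs location; assumed to be present in the guest.
HW_RANDOM_DIR = Path("/sys/class/misc/hw_random")


def current_rng(sysfs_dir=HW_RANDOM_DIR):
    """Return the name of the active hardware RNG, or None if none is exposed."""
    path = Path(sysfs_dir) / "rng_current"
    if not path.exists():
        return None
    return path.read_text().strip()


def uses_virtio_rng(sysfs_dir=HW_RANDOM_DIR):
    """True if the active RNG name looks like a virtio-rng device."""
    name = current_rng(sysfs_dir)
    return name is not None and name.startswith("virtio_rng")


def read_random_bytes(count=16, device="/dev/hwrng"):
    """Read a few bytes from the hardware RNG device (usually requires root)."""
    with open(device, "rb") as handle:
        return handle.read(count)


if __name__ == "__main__":
    print("active rng:", current_rng())
    print("virtio-rng in use:", uses_virtio_rng())
```

Running it in the guest after steps 3 and 5 would show whether a virtio-rng device is the active entropy source in both configurations.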

Comment 10 Sandro Bonazzola 2015-12-22 13:28:59 UTC
oVirt 3.6.0 has been released and the bz verified, moving to closed current release.