Bug 667309 - [RHEVM] Running Vms for the second time always fails
Summary: [RHEVM] Running Vms for the second time always fails
Keywords:
Status: CLOSED DUPLICATE of bug 624252
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.0
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-01-05 07:58 UTC by Idan Mansano
Modified: 2011-10-14 08:22 UTC
CC: 16 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-24 21:06:04 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
libvirt log (150.22 KB, text/plain)
2011-01-06 17:40 UTC, Haim

Description Idan Mansano 2011-01-05 07:58:37 UTC
VDSM Version: vdsm-4.9-39.el6.
Libvirt Version: libvirt-0.8.1-27.el6_0.2

We encountered the following issue:
1. We create and run a new VM
2. we destroy that VM (Stop it)
3. we run the VM again
After the second run, libvirt sends a "destroyed" event, which VDSM takes to mean the run failed.
We believe this is a stale event left over from step 2 (the VM is actually running fine), so as far as VDSM is concerned, the second run failed.
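The suspected failure mode in the steps above can be sketched as a toy model (purely illustrative; this is not vdsm or libvirt code): the "destroyed" event from step 2 sits undispatched until the loop is kicked by step 3, and is then misattributed to the new run.

```python
from collections import deque

pending = deque()   # events queued by the event loop but not yet dispatched
vm_state = {}       # management-side view: vm name -> state

def emit(event):
    # libvirt queues the event but (per the suspected bug) never wakes the loop
    pending.append(event)

def dispatch_all():
    # dispatch happens only when fresh activity kicks the loop
    while pending:
        vm, kind = pending.popleft()
        if kind == "STOPPED":
            vm_state[vm] = "down"   # manager concludes this run failed

vm_state["vm1"] = "up"        # step 1: create and run the VM
emit(("vm1", "STOPPED"))      # step 2: destroy it; event queued, not dispatched
vm_state["vm1"] = "up"        # step 3: run it again, kicking the loop...
dispatch_all()
print(vm_state["vm1"])        # -> "down": the stale event lands on the new run
```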

Comment 2 Idan Mansano 2011-01-05 08:54:38 UTC
Important info:
1. This issue seems to have already existed in previous RHEVM builds
(for example: ic74).
2. The issue happens only if there are no running VMs at all in the cluster.
If there is at least one running VM in the cluster, everything works fine.

Moving this bug to urgent, according to my manager.

Comment 3 Daniel Berrangé 2011-01-05 11:18:34 UTC
This could be a similar problem to that described in bug 666158

Comment 5 Daniel Berrangé 2011-01-06 15:02:03 UTC
Please provide the full XML for the guest being created, the logfile in /var/log/libvirt/qemu/$GUEST.log, and finally the exact error message received from libvirt when the 2nd virDomainCreate attempt fails.

Comment 6 Daniel Berrangé 2011-01-06 15:03:37 UTC
If this is truly a regression, then please also provide info on what version of libvirt you need to downgrade to before it works correctly again, e.g. does the previous build 0.8.1-27.el6_0.1 work?

Comment 7 Haim 2011-01-06 17:40:09 UTC
Created attachment 472103 [details]
libvirt log

(In reply to comment #5)
> Please provide the full XML for the guest being created, the logfile in
> /var/log/libvirt/qemu/$GUEST.log, and finally the exact error message received
> from libvirt when the 2nd virDomainCreate attempt fails.

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -S -M rhel6.0.0 -cpu Conroe -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name rhel6-nfs-1 -uuid 796d95ea-1640-4aea-9f12-0d9ea0440ee3 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/rhel6-nfs-1.monitor,server,nowait -mon chardev=monitor,mode=control -rtc base=2011-01-06T17:31:36 -boot nc -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4 -drive file=/rhev/data-center/cf4e325a-482b-4e20-8b1d-6b1acd5c7dc4/78cbee4a-f021-47d1-9f90-c6ef34c2935d/images/7c571638-4826-46ee-8a9b-9d4232154ace/f5d32eff-5adc-4787-a47d-3cccb98b8ccb,if=none,id=drive-virtio-disk0,boot=on,format=raw,serial=ee-8a9b-9d4232154ace,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=25,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:1a:4a:16:87:30,bus=pci.0,addr=0x3 -chardev socket,id=channel0,path=/var/lib/libvirt/qemu/channels/rhel6-nfs-1.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=0,chardev=channel0,name=com.redhat.rhevm.vdsm -usb -device usb-tablet,id=input0 -vnc 0:0,password -k en-us -vga cirrus
2011-01-06 19:31:38.982: shutting down

The error is not clear from the logs; I found a warning with a permission error at the end, though I'm not sure it's related.
Attached is the full libvirt log covering 2 virDomainCreate attempts.

19:31:20.243: 30734: warning : virDomainDiskDefForeachPath:8298 : Ignoring open failure on /rhev/data-center/cf4e325a-482b-4e20-8b1d-6b1acd5c7dc4/78cbee4a-f021-47d1-9f90-c6ef34c2935d/images/7c571638-4826-46ee-8a9b-9d4232154ace/f5d32eff-5adc-4787-a47d-3cccb98b8ccb: Permission denied

Comment 8 Daniel Berrangé 2011-01-06 17:49:49 UTC
> 3. we run the VM again
> After the second run, libvirt sends a destroyed event which means the run
> failed.  

I don't see any evidence in the logs that the 2nd VM run failed. Everything indicates it successfully started QEMU, and that QEMU then shut down. Did you actually get an *error code* when starting the guest, or are you simply inferring failure from the fact that you got an async event?

Comment 9 Dan Kenigsberg 2011-01-06 20:45:09 UTC
Daniel, the VM fails to start - as perceived by RHEVM. Libvirt creates the domain, but immediately *after* creation, vdsm receives a VIR_DOMAIN_EVENT_STOPPED event with detail VIR_DOMAIN_EVENT_STOPPED_CRASHED. Could this be related to bug 624252?

(I suddenly think there might be an inherent race in the way vdsm handles libvirt's events. We may need to add a barrier that makes sure that all events have been processed before issuing a critical libvirt API. Do you have a suggestion?)
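One way to implement the barrier suggested above (a hypothetical sketch, not vdsm code; all names are illustrative): push a sentinel through the same queue the dispatcher drains and wait for it. FIFO ordering then guarantees every earlier event has been handled before the critical call is issued.

```python
import queue
import threading

class EventBarrier:
    """Dispatches events on a worker thread; wait_drained() blocks until
    every event delivered before the call has been processed."""

    def __init__(self):
        self._q = queue.Queue()
        threading.Thread(target=self._dispatch, daemon=True).start()

    def deliver(self, event):
        self._q.put(("event", event))

    def _dispatch(self):
        while True:
            kind, payload = self._q.get()
            if kind == "sync":
                payload.set()     # queue is drained up to this point
            # else: handle the libvirt event here

    def wait_drained(self, timeout=5.0):
        done = threading.Event()
        self._q.put(("sync", done))
        return done.wait(timeout)

bar = EventBarrier()
bar.deliver("VIR_DOMAIN_EVENT_STOPPED")
assert bar.wait_drained()   # all earlier events processed; safe to issue the critical API call
```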

Comment 10 Jiri Denemark 2011-01-18 12:58:45 UTC
I think you are right that this could be related to bug 624252. Quoting from Cole's description:

> The events actually weren't being lost, it's just that the event loop didn't
> think there was anything that needed to be dispatched. So all those 'lost
> events' should actually get re-triggered if you manually kick the loop by
> generating a new event (like creating a new guest).

That is, the stopped event might have been waiting for dispatch until a new domain was started.

We will need to check this again after that bug is fixed.
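Cole's description can be modeled with a toy event loop (purely illustrative, not the libvirt implementation): the buggy queueing path forgets to wake the loop, so events accumulate until some later event sets the wakeup flag and flushes the whole backlog.

```python
class ToyEventLoop:
    def __init__(self):
        self.queue = []        # queued but undispatched events
        self.wakeup = False    # loop only dispatches when this is set
        self.dispatched = []

    def queue_event(self, ev, *, kick):
        self.queue.append(ev)
        if kick:               # the fixed code always kicks; the buggy path didn't
            self.wakeup = True

    def run_once(self):
        if not self.wakeup:    # loop believes nothing needs dispatching
            return
        self.wakeup = False
        self.dispatched.extend(self.queue)
        self.queue.clear()

loop = ToyEventLoop()
loop.queue_event("STOPPED(vm1)", kick=False)  # buggy path: no wakeup
loop.run_once()
assert loop.dispatched == []                  # event appears "lost"
loop.queue_event("STARTED(vm1)", kick=True)   # new guest kicks the loop
loop.run_once()
assert loop.dispatched == ["STOPPED(vm1)", "STARTED(vm1)"]  # backlog flushed
```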

Comment 11 Jiri Denemark 2011-01-24 12:47:50 UTC
Could you check whether this issue still exists with libvirt-0.8.7-3.el6 package, which fixes bug 624252?

Comment 12 David Naori 2011-01-24 18:37:14 UTC
Checked on libvirt-0.8.7-3 - it seems the issue is solved.

Comment 13 Jiri Denemark 2011-01-24 21:06:04 UTC
Great, thanks for the testing. I'll close this as a duplicate.

*** This bug has been marked as a duplicate of bug 624252 ***

