Description of problem:
The guest doesn't respond to the shutdown request from virsh:

terminal 1:
# virsh list
 Id Name                 State
----------------------------------
  1 rhel5Latest_x86_64_hvm_guest_kvm running

# virsh console 1
Connected to domain rhel5Latest_x86_64_hvm_guest_kvm
Escape character is ^]

[root@localhost ~]#

terminal 2:
# virsh list
 Id Name                 State
----------------------------------
  1 rhel5Latest_x86_64_hvm_guest_kvm running

# virsh shutdown 1
Domain 1 is being shutdown

(after a long while)
# virsh list
 Id Name                 State
----------------------------------
  1 rhel5Latest_x86_64_hvm_guest_kvm running

Nothing happens in the guest console.

Version-Release number of selected component (if applicable):
# uname -a ; rpm -qa | egrep "kvm|libvirt"
Linux ibm-x3655-04.ovirt.rhts.bos.redhat.com 2.6.18-150.el5 #1 SMP Wed May 20 20:25:53 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
etherboot-zroms-kvm-5.4.4-10.el5
kvm-83-41.el5
libvirt-0.6.3-3.el5
libvirt-0.6.3-3.el5
libvirt-python-0.6.3-3.el5
kmod-kvm-83-41.el5

How reproducible:
Every time

Steps to Reproduce:
1. virsh shutdown $guest

Additional info:
This is breaking automation scripts. If this can be fixed quickly, it'd be great.
Please provide dmesg from inside your guest, and the libvirt XML config from the host.
Also, is acpid running inside the guest? If not, the ACPI event that KVM uses to shut down the guest will be received by the guest kernel, but no action will be taken.

Chris Lalancette
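Since the shutdown request is only an ACPI event that the guest may never act on, an automation script probably should not assume that `virsh shutdown` has succeeded just because the command returned. A minimal sketch of polling for the result follows. The function name `wait_for_shutdown` and the 120-second default are hypothetical, and treating an empty `virsh domstate` result as "stopped" is an assumption that only holds when libvirtd itself is healthy.

```shell
# Ask a guest to shut down, then poll until libvirt reports it as off.
# Returns 0 if the guest stopped within the timeout, 1 otherwise.
# (Hypothetical helper; the name and the default timeout are illustrative.)
wait_for_shutdown() {
    dom=$1
    timeout=${2:-120}   # seconds to wait before giving up

    virsh shutdown "$dom" >/dev/null || return 1

    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Assumption: a failing domstate means the (transient) domain is gone.
        state=$(virsh domstate "$dom" 2>/dev/null) || state=""
        case $state in
            "shut off"|"") return 0 ;;
        esac
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

A caller could then decide what to do on timeout, for example `wait_for_shutdown 1 300 || echo "guest ignored the shutdown request"`.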
Created attachment 345908 [details] dmesg in the guest, before the shutdown command is issued. nothing else in it after it was issued.
Created attachment 345909 [details] guest xml
(In reply to comment #2)
> Also, is acpid running inside the guest? If not, the ACPI event that KVM uses
> to shutdown the guest will be received by the guest kernel, but no action will
> be taken.
>
> Chris Lalancette

No it wasn't running... and I can't get it to run either.

[root@localhost ~]# service acpid status
acpid is stopped
[root@localhost ~]# service acpid start
Starting acpi daemon: acpid: can't open /proc/acpi/event: Device or resource busy
[FAILED]
[root@localhost ~]#
Ah, OK, that will be the cause of your woes. Figuring out why acpid won't start will probably be the next step here. Chris Lalancette
Arg! I know what the problem is. In order to make hal play nicer with libvirtd, we moved haldaemon start up earlier in the boot process (BZ 500577). Because of that, haldaemon is getting a handle to /proc/acpi/event, which is preventing acpid from getting that same handle.

Possible solutions:
1) Allow multiple readers of /proc/acpi/event (requires kernel changes, and I don't know what the implications of doing that are)
2) Don't have haldaemon attempt to access /proc/acpi/event. After all, in 5.3 when HAL was being started after acpid, it must have been silently failing to get a handle on /proc/acpi/event, so this wouldn't change anything.

Richard, do you have any thoughts/ideas here? How is this handled in Fedora?

Chris Lalancette
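One way to confirm which process is holding the handle, sketched here on the assumption that `fuser` (from psmisc) is installed in the guest and that the check is run as root. The function name `acpi_event_holder` is hypothetical.

```shell
# Report which process, if any, has the ACPI event file open.
# (Hypothetical helper; defaults to the path from the acpid error message.)
acpi_event_holder() {
    event_file=${1:-/proc/acpi/event}

    if [ ! -e "$event_file" ]; then
        echo "$event_file does not exist"
        return 1
    fi

    # fuser -v lists the PID, user and command of every process that has
    # the file open; it exits non-zero when no process is using it.
    if ! fuser -v "$event_file"; then
        echo "no process has $event_file open"
    fi
}
```

Running `acpi_event_holder` with no argument inside the affected guest would be expected to list the HAL daemon if the diagnosis above is right.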
haldaemon automatically connects to acpid if it's running instead of reading /proc/acpi/event (at least on my up-to-date Gentoo machine at home). So I think starting acpid before hal should solve it...
We could just compile HAL with --disable-acpi-proc and make it fall back to acpid always... If it's started before acpid then it just retries every few seconds until acpid is started.
WRT comment #7, in Fedora acpid starts at priority 26, which is actually identical to HAL, but acpid wins because 'a' sorts before 'h'. So in Fedora, assuming acpid was turned on, hal must be talking to acpid. My concern with disabling acpi-proc support is that it could cause other regressions for people who might have HAL turned on, but acpid turned off.
Yes, I think it's safer if we just move acpid to start a little earlier than HAL in rhel.
Ok, looks like we're agreed that acpid should be updated to start at a higher priority, matching its priority in Fedora, such that it starts immediately before HAL. Reassigning component...
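To check the resulting order on a given system, a sketch along these lines may help. It assumes the usual SysV layout in which the rc script runs the `S*` links of a runlevel directory in sorted (glob) order, so that `S26acpid` runs before `S26haldaemon`. The function name `start_order` is hypothetical.

```shell
# Print the start links of the named services in the order rc would run them.
# Usage: start_order RCDIR SERVICE...
# (Hypothetical helper; relies on glob expansion being sorted, as rc does.)
start_order() {
    rcdir=$1
    shift
    for link in "$rcdir"/S*; do
        [ -e "$link" ] || continue
        name=${link##*/}           # e.g. S26acpid
        svc=${name#S[0-9][0-9]}    # e.g. acpid
        for want in "$@"; do
            if [ "$svc" = "$want" ]; then
                echo "$name"
            fi
        done
    done
}
```

For example, `start_order /etc/rc3.d acpid haldaemon` (the runlevel 3 directory is an assumption here) should list the acpid link first once the fix is in place.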
The same bug existed in a Fedora release (7 or 8, I don't remember exactly which right now). The fix is easy, just a one-line correction in acpid's init script. I hope that this correction fixes it and there are no other problems :-)
*** Bug 503787 has been marked as a duplicate of this bug. ***
Alex, try the following:

# service haldaemon stop
# service acpid restart
# service haldaemon start

I suspect that will make things very happy.
(In reply to comment #20)
> Alex, try the following:
>
> # service haldaemon stop
> # service acpid restart
> # service haldaemon start
>
> I suspect that will make things very happy.

Right, that didn't issue any warning or error messages.
Yay, acpid-1.0.4-9.el5 works.
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Previously, the Hardware Abstraction Layer (HAL) daemon was initialized before the ACPI daemon. Consequently, this resulted in the HAL daemon preventing the ACPI daemon from accessing /proc/acpi/event. With this update, the acpid package has been updated so the ACPI daemon now starts before the HAL daemon, which resolves this issue. (BZ#503177)
(In reply to comment #28)
> Release note added. If any revisions are required, please set the
> "requires_release_notes" flag to "?" and edit the "Release Notes" field
> accordingly.
> All revisions will be proofread by the Engineering Content Services team.
>
> New Contents:
> Previously, the Hardware Abstraction Layer (HAL) daemon was initialized before
> the ACPI daemon. Consequently, this resulted in the HAL daemon preventing the
> ACPI daemon from accessing /proc/acpi/event. With this update, the acpid
> package has been updated so the ACPI daemon now starts before the HAL daemon,
> which resolves this issue. (BZ#503177)

Wait, no, this isn't right. In 5.3, acpid started long before HAL, so things were just fine. During *development* of 5.4, we switched this around, but we found it caused problems, so we switched it back. That is, as far as customers are concerned, between 5.3 and 5.4 nothing changed with these two; acpid starts before HAL, as it always has. They now have different priorities relative to other system services, which might be worth release noting, but the note above doesn't really make sense from a customer perspective.

Ryan, can we remove this release note?

Chris Lalancette
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1403.html