Bug 771570
| Summary: | Restart libvirtd will get error and fail to reconnect domains on nfs storage | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Wayne Sun <gsun> |
| Component: | libvirt | Assignee: | Eric Blake <eblake> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Priority: | high |
| Version: | 6.3 | CC: | acathrow, dallan, mzhan, rwu, veillard, weizhan, ydu |
| Target Milestone: | rc | Hardware: | Unspecified |
| OS: | Unspecified | Doc Type: | Bug Fix |
| Fixed In Version: | libvirt-0.9.9-1.el6 | Last Closed: | 2012-06-20 06:43:26 UTC |
| Bug Depends On: | 746666 | | |

Description
Wayne Sun
2012-01-04 07:27:08 UTC
Is the bug present on 0.9.4-23.el6 and libvirt-0.9.4-23.el6_2.1 ?

To be 100% clear can you stop and undefine all guests on that machine, then install 0.9.4-23.el6 on the machine, then redefine the guests and restart libvirtd, is the problem still happening.
I'm trying to make sure that the parsing error that you are seeing are not coming from the fact that the domains were defined with an incompatible version of libvirt.

Daniel

(In reply to comment #2)
> Is the bug present on 0.9.4-23.el6 and libvirt-0.9.4-23.el6_2.1 ?
>
> To be 100% clear can you stop and undefine all guests on that
> machine, then install 0.9.4-23.el6 on the machine,
> then redefine the guests and restart libvirtd, is the problem
> still happening.
> I'm trying to make sure that the parsing error that you are seeing
> are not coming from the fact that the domains were defined with an
> incompatible version of libvirt.
>
> Daniel

With the domain freshly defined after installing libvirt, then restarting libvirtd:

libvirt-0.9.4-23: fine

0.9.4-23.el6_2.1: fine

0.9.4-23.el6_2.2:
17:01:10.086: 30314: error : virSecurityLabelDefParseXMLHelper:2113 : XML error: security label is missing
domain crash

libvirt-0.9.4-23.el6_2.3:
17:04:43.813: 31250: error : virSecurityLabelDefParseXMLHelper:2113 : XML error: security label is missing
domain crash

libvirt-0.9.9-0rc1:
2012-01-04 08:45:26.266+0000: 26784: error : virSecurityLabelDefParseXMLHelper:2593 : XML error: security label is missing
The running domain crashes.

So this does not happen only on the z-stream builds; it also occurs with libvirt-0.9.9-0rc1.

What does "domain crash" mean ? Is the qemu-kvm process associated to the domain killed and doesn't run anymore, or just that the domain doesn't show up in the list produced by libvirt/virsh ?

thanks,

Daniel

(In reply to comment #4)
> What does "domain crash" mean ? Is the qemu-kvm process associated to
> the domain killed and doesn't run anymore, or just that the domain
> doesn't show up in the list produced by libvirt/virsh ?
>
> thanks,
>
> Daniel

1. check domain status

# virsh list
 Id Name                 State
----------------------------------
  2 apitest1             running

# ps aux|grep qemu-kvm|grep -v grep
qemu 1703 1.0 0.2 595724 28080 ? Sl 17:21 0:00 /usr/libexec/qemu-kvm -S -M rhel6.2.0 -no-kvm -m 215 -smp 1,sockets=1,cores=1,threads=1 -name apitest1 -uuid ce64f9ed-57e5-9d0d-7363-e04f4f9c1094 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/apitest1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -no-acpi -drive file=/var/lib/libvirt/images/apitest1,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -usb -spice port=5900,addr=0.0.0.0,disable-ticketing -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

2. restart libvirtd

# service libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]

3. recheck

# virsh list
 Id Name                 State
----------------------------------

# ps aux|grep qemu-kvm|grep -v grep
qemu 1703 0.9 0.2 595724 28080 ? Sl 17:21 0:00 /usr/libexec/qemu-kvm -S -M rhel6.2.0 -no-kvm -m 215 -smp 1,sockets=1,cores=1,threads=1 -name apitest1 -uuid ce64f9ed-57e5-9d0d-7363-e04f4f9c1094 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/apitest1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -no-acpi -drive file=/var/lib/libvirt/images/apitest1,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -usb -spice port=5900,addr=0.0.0.0,disable-ticketing -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

Confirmed on libvirt 0.9.4-23.el6_2.2, 0.9.4-23.el6_2.3 and 0.9.9-0rc1.
So, there is no output from virsh list, but the qemu-kvm process still exists.

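The check in steps 1 to 3 above (compare what virsh list reports with the qemu-kvm processes that are actually alive) can be sketched in Python. This is only an illustration that works on captured text output; the function names are made up for this example, the sample ps line is shortened from the one in the report, and the parsing assumes the plain "-name <guest>" form seen in this report.

```python
import re


def listed_domains(virsh_output):
    """Return the names of domains shown with a numeric Id in `virsh list` text."""
    names = set()
    for line in virsh_output.splitlines():
        fields = line.split()
        # Data rows look like "<id> <name> <state>"; the header and the
        # dashed ruler do not start with a number and are skipped.
        if len(fields) >= 3 and fields[0].isdigit():
            names.add(fields[1])
    return names


def qemu_guest_names(ps_output):
    """Return guest names taken from the -name argument of qemu-kvm processes."""
    names = set()
    for line in ps_output.splitlines():
        if "qemu-kvm" not in line:
            continue
        match = re.search(r"\s-name\s+(\S+)", line)
        if match:
            names.add(match.group(1))
    return names


def orphaned_guests(virsh_output, ps_output):
    """Guests whose qemu-kvm process is alive but which virsh no longer lists."""
    return qemu_guest_names(ps_output) - listed_domains(virsh_output)


# Sample data modelled on the report (ps command line shortened here).
VIRSH_BEFORE = """\
 Id Name                 State
----------------------------------
  2 apitest1             running
"""
VIRSH_AFTER = """\
 Id Name                 State
----------------------------------
"""
PS_OUTPUT = (
    "qemu 1703 1.0 0.2 595724 28080 ? Sl 17:21 0:00 /usr/libexec/qemu-kvm "
    "-S -M rhel6.2.0 -name apitest1 -uuid ce64f9ed-57e5-9d0d-7363-e04f4f9c1094"
)

print(orphaned_guests(VIRSH_BEFORE, PS_OUTPUT))  # before the restart
print(orphaned_guests(VIRSH_AFTER, PS_OUTPUT))   # after the restart
```

A non-empty result after the restart corresponds to the symptom described here: the qemu-kvm process survives, but libvirtd failed to reconnect to it.
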
Can you attach the /var/run/libvirt/qemu/$dom.xml file of the domain that is failing to restore? I suspect that my recent changes to address bug 746666 have an error where the XML being generated on output is not quite being reparsed correctly back on input when libvirtd restarts, but I need to see the XML in question to make sure.

Meanwhile, I'm still investigating, and hope to have a patch soon.

(In reply to comment #0)
> Description of problem:
> There are defined and running domains on host, the domain images are on
> nfs share. After restart libvirtd, there will get xml parse errors and
> crash the running domains. Selinux is enforcing and virt_use_nfs is on.
> # sestatus
> SELinux status: enabled
> SELinuxfs mount: /selinux
> Current mode: enforcing
> Mode from config file: enforcing
> Policy version: 24
> Policy from config file: targeted
> # getsebool virt_use_nfs
> virt_use_nfs --> on
>
>
> downgrade or update libvirt also have this problem.
> I've tried downgrade from 0.9.9.-0rc1 to 0.9.4-23.el6_2.2.x86_64 and
> update from 0.9.4-23.el6_2.2.x86_64 to 0.9.4-23.el6_2.3, guests will crash.

Downgrading is not generally a supported operation, although if we can trivially support it, we should. That is, once a guest XML has been written with a newer libvirt, there is no guarantee that an older libvirt will parse it correctly. I'm more worried about the upgrade path - if a guest was started during an older libvirt, then libvirt is upgraded, the restarted libvirtd should not have any problems reading that older xml.

Upstream patch proposed:
https://www.redhat.com/archives/libvir-list/2012-January/msg00148.html

commit 302fe95ffa1bc5f1c61c0beb31a1adfbc38c668e
Author: Eric Blake <eblake>
Date: Wed Jan 4 16:01:24 2012 -0700
seclabel: fix regression in libvirtd restart
Commit b434329 has a logic bug: seclabel overrides don't set
def->type, but the default value is 0 (aka static). Restarting
libvirtd would thus reject the XML for any domain with an
override of <seclabel relabel='no'/> (which happens quite
easily if a disk image lives on NFS), with a message:
2012-01-04 22:29:40.949+0000: 6769: error : virSecurityLabelDefParseXMLHelper:2593 : XML error: security label is missing
Fix the logic to never read from an override's def->type, and
to allow a missing <label> subelement when relabel is no. There's
a lot of stupid double-negatives in the code (!norelabel) because
of the way that we want the zero-initialized defaults to behave.
* src/conf/domain_conf.c (virSecurityLabelDefParseXMLHelper): Use
type field from correct location.
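
The logic bug described in the commit message can be modelled in a few lines of Python. This is a simplified sketch and not the actual C code in src/conf/domain_conf.c: the function name, the `fixed` flag, the numeric value chosen for DYNAMIC, and the rule that a missing type attribute means dynamic are all assumptions made for illustration. Only "0 means static" and the error text come from the report.

```python
import xml.etree.ElementTree as ET

# The commit message only states that the zero value means "static";
# the value used for DYNAMIC here is an arbitrary choice for the sketch.
STATIC, DYNAMIC = 0, 1


def parse_seclabel(xml_text, is_override, fixed):
    """Toy model of seclabel parsing; raises ValueError when a label is required but absent."""
    node = ET.fromstring(xml_text)

    sec_type = STATIC  # zero-initialised default
    if not is_override:
        # Assumption: only the domain-level element sets the type, and a
        # missing type attribute is treated as dynamic.
        sec_type = STATIC if node.get("type") == "static" else DYNAMIC

    norelabel = node.get("relabel") == "no"
    label = node.findtext("label")  # None when there is no <label> child

    if fixed:
        # Never consult the type of an override; a missing <label> is
        # acceptable there when relabel is 'no'.
        label_required = (not norelabel) if is_override else sec_type == STATIC
    else:
        # Regression: the override's never-assigned type reads as STATIC,
        # so a <label> is demanded even for relabel='no'.
        label_required = sec_type == STATIC

    if label_required and label is None:
        raise ValueError("XML error: security label is missing")
    return {"norelabel": norelabel, "label": label}


OVERRIDE_XML = "<seclabel relabel='no'/>"

try:
    parse_seclabel(OVERRIDE_XML, is_override=True, fixed=False)
except ValueError as err:
    print("old logic:", err)

print("new logic:", parse_seclabel(OVERRIDE_XML, is_override=True, fixed=True))
```

With the old logic the override is rejected with the same message seen in the libvirtd log, while the corrected logic accepts it and still rejects a static domain-level seclabel that has no label.
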

Tested this bug on libvirt-0.9.9-1.el6.x86_64. Following the reproduction steps from this bug, after restarting libvirtd the guest is still listed by virsh list and remains in the running state. So move bug to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html