Created attachment 347598 [details] output of virsh dumpxml vm0 Description of problem: libvirt segfaults upon virsh dominfo <domain> just before reporting security model information. i686, KVM Version-Release number of selected component (if applicable): 0.6.2-11.fc1.i586 How reproducible: Repeatable on first try after boot and initial login. After a manual restart of libvirtd (i.e. /sbin/service libvirtd restart) after the crash, virsh dominfo <domain> works. Steps to Reproduce: 1. Boot system and login. 2. Run virt-manager and start guests running. 3. Run virsh dominfo <domain> and watch libvirtd seg fault. Actual results: libvirtd seg faults (traceback below). virsh reports error: server closed connection before reporting security info. Expected results: libvirtd doesn't segfault. virsh reports Security model, DOI, and label information. Additional info: Traceback: #0 0x005bf053 in strlen () from /lib/libc.so.6 #1 0x0806e911 in qemudNodeGetSecurityModel (conn=0xb4c01d18, secmodel=0xb80bdf6a) at qemu_driver.c:3141 #2 0x00d8c8ae in virNodeGetSecurityModel (conn=0x0, secmodel=0xe11c60) at libvirt.c:4233 #3 0x0806129a in remoteDispatchNodeGetSecurityModel (server=0x91fb680, client=0x91fb700, conn=0xb4c01d18, rerr=0xb80be1dc, args=0xb80be294, ret=0xb80be254) at remote.c:1397 #4 0x080611d6 in remoteDispatchClientRequest (server=0x91fb680, client=0x91fb700, msg=0x929f488) at remote.c:323 #5 0x080562dc in qemudWorker (data=0x921e200) at qemud.c:1467 #6 0x006f0935 in start_thread () from /lib/libpthread.so.0 #7 0x0062482e in clone () from /lib/libc.so.6 (gdb) up #1 0x0806e911 in qemudNodeGetSecurityModel (conn=0xb4c01d18, secmodel=0xb80bdf6a) at qemu_driver.c:3141 3141 if (strlen(p) >= VIR_SECURITY_MODEL_BUFLEN-1) { (gdb) print secmodel $1 = (virSecurityModelPtr) 0xb80bdf6a (gdb) print *secmodel $2 = {model = '\0' <repeats 256 times>, doi = '\0' <repeats 256 times>} (gdb) print p $3 = 0x0
There is a couple missing check for sure, qemuDriverLock(driver); if (!driver->securityDriver) { ret = -2; goto cleanup; } p = driver->caps->host.secModel.model; if (strlen(p) >= VIR_SECURITY_MODEL_BUFLEN-1) { we should also check for driver->caps and driver->caps->host.secModel.model being NULL.
That does not entirely make sense. If 'securityDriver' is *not* NULL, then it should be impossible for driver->caps->host.secModel.model or 'doi' to be NULL. We explicitly check for a NULL 'model'/'doi' when activating the securityDriver at time libvirtd starts. Likewise it is impossible for 'driver->caps' to be NULL, because the QEMU driver would have exited with error during startup if this had occurred. qemudSecurityInit(struct qemud_driver *qemud_drv) ...snip.... qemud_drv->securityDriver = security_drv; doi = virSecurityDriverGetDOI(security_drv); model = virSecurityDriverGetModel(security_drv); VIR_DEBUG("Initialized security driver \"%s\" with " "DOI \"%s\"", model, doi); /* * Add security policy host caps now that the security driver is * initialized. */ caps = qemud_drv->caps; caps->host.secModel.model = strdup(model); if (!caps->host.secModel.model) { char ebuf[1024]; VIR_ERROR(_("Failed to copy secModel model: %s"), virStrerror(errno, ebuf, sizeof ebuf)); return -1; } caps->host.secModel.doi = strdup(doi); if (!caps->host.secModel.doi) { char ebuf[1024]; VIR_ERROR(_("Failed to copy secModel DOI: %s"), virStrerror(errno, ebuf, sizeof ebuf)); return -1; }
Well, I guess libvirtd was started at a point where *something* prevented the security model from loading, probably because another service wasn't started, virSecurityDriverStartup returns -2 and you still have a NULL string there, though the driver started... If restarted after the full boot completed, then libvirtd security driver initialize properly. Still I think the code should check for the NULL pointers there. Daniel
If the security driver fails to startup and returns -2, then the 'driver->securityDriver' field will be NULL though, so the first check in qemudNodeGetSecurityModel should catch that
More information: The segfault is reproducible even after manual restart of libvirtd after boot, just not every time. So it isn't only linked to the initial startup of libvirtd. I might try running libvirtd under valgrind if I get a chance, or someone else could try that, and then exercise virsh dominfo.
/sbin/service libvirtd stop valgrind /usr/sbin/libvirtd virt-manager <Started vm0 running> virsh dominfo vm0 yields the same traceback as the prior gdb traceback with an Invalid read of size 1 upon the strlen() call. So no prior memory corruption detected by valgrind before the erroneous reference.
Can you run with fully verbose debugging enabled, eg LIBVIRT_DEBUG=1 /usr/sbin/libvirtd and capture the output. This should let us see if anything is going wrong during driver initialization.
Created attachment 348243 [details] output of LIBVIRT_DEBUG=1 /usr/sbin/libvirtd This was the output from LIBVIRT_DEBUG=1 from the beginning up to a segfault upon virsh dominfo vm0. First try at virsh dominfo vm0 was ok, but after starting virt-manager and trying again, it segfaulted.
Running it again without LIBVIRT_DEBUG, and attaching to it with gdb, and breaking in qemudNodeGetSecurityModel, the driver caps info is: (gdb) print *((struct qemud_driver *)conn->privateData)->caps $10 = {host = {arch = 0xb4b00468 "i686", nfeatures = 0, features = 0x0, offlineMigrate = 0, liveMigrate = 0, nmigrateTrans = 0, migrateTrans = 0x0, nnumaCell = 0, numaCell = 0x0, secModel = { model = 0x0, doi = 0x0}}, nguests = 6, guests = 0xb4b00ae0, macPrefix = "RT"}
Arrggh @(*)(%#@%#@. I've discovered what's going wrong here. Everytime the virGetCapabilities() method is run, it destroys the existing virCapsPtr object and creates it again....without any of the selinux info. virt-manager runs this method. so once virt-manager has run, subsequent calls to get the security info will give the crash you see.
http://www.redhat.com/archives/libvir-list/2009-June/msg00280.html
Will this patch be back-ported to f11?
That applied upstream so should be in 0.6.5, I don't know if that will be applied to F11 which is still 0.6.2 maybe using the virt-preview repo will be better http://fedorapeople.org/~markmc/virt-preview/f11/ I expect it to be updated with 0.6.5 soon after next week release. Please reopen if you really want a backport instead, thanks Daniel
okay, summary is if you do: $> virsh capabilities $> virsh dominfo foo then libvirtd crashes I think that's certainly worth backporting the fix to F-11 for
Rawhide: * Fri Jul 3 2009 Mark McLoughlin <markmc> - 0.6.4-4.fc12 - Fix libvirtd crash with bad capabilities data (bug #505635) F-11: * Fri Jul 3 2009 Mark McLoughlin <markmc> - 0.6.2-13.fc11 - Fix libvirtd crash with bad capabilities data (bug #505635)
libvirt-0.6.2-13.fc11 has been pushed to the Fedora 11 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update libvirt'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-7449
libvirt-0.6.2-13.fc11 fixes the bug for me.
libvirt-0.6.2-13.fc11 has been pushed to the Fedora 11 stable repository. If problems still persist, please make note of it in this bug report.