Bug 505625

Summary: libvirtd segfault during virsh dominfo on security model
Product: [Fedora] Fedora Reporter: Stephen Smalley <sds>
Component: libvirtAssignee: Daniel Veillard <veillard>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: high    
Version: 11CC: berrange, clalance, crobinso, dwalsh, itamar, markmc, veillard, virt-maint
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: 0.6.2-13.fc11 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-07-27 17:27:51 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Bug Depends On:    
Bug Blocks: 480594    
Attachments:
Description Flags
output of virsh dumpxml vm0
none
output of LIBVIRT_DEBUG=1 /usr/sbin/libvirtd none

Description Stephen Smalley 2009-06-12 11:58:26 EDT
Created attachment 347598 [details]
output of virsh dumpxml vm0

Description of problem:
libvirt segfaults upon virsh dominfo <domain> just before reporting security model information.  i686, KVM

Version-Release number of selected component (if applicable):
0.6.2-11.fc1.i586

How reproducible:
Repeatable on first try after boot and initial login.  After a manual restart of libvirtd (i.e. /sbin/service libvirtd restart) after the crash, virsh dominfo <domain> works.

Steps to Reproduce:
1. Boot system and login.
2. Run virt-manager and start guests running.
3. Run virsh dominfo <domain> and watch libvirtd seg fault.
  
Actual results:
libvirtd seg faults (traceback below).
virsh reports error:  server closed connection before reporting security info.

Expected results:
libvirtd doesn't segfault.
virsh reports Security model, DOI, and label information.

Additional info:
Traceback:
#0  0x005bf053 in strlen () from /lib/libc.so.6
#1  0x0806e911 in qemudNodeGetSecurityModel (conn=0xb4c01d18, 
    secmodel=0xb80bdf6a) at qemu_driver.c:3141
#2  0x00d8c8ae in virNodeGetSecurityModel (conn=0x0, secmodel=0xe11c60)
    at libvirt.c:4233
#3  0x0806129a in remoteDispatchNodeGetSecurityModel (server=0x91fb680, 
    client=0x91fb700, conn=0xb4c01d18, rerr=0xb80be1dc, args=0xb80be294, 
    ret=0xb80be254) at remote.c:1397
#4  0x080611d6 in remoteDispatchClientRequest (server=0x91fb680, 
    client=0x91fb700, msg=0x929f488) at remote.c:323
#5  0x080562dc in qemudWorker (data=0x921e200) at qemud.c:1467
#6  0x006f0935 in start_thread () from /lib/libpthread.so.0
#7  0x0062482e in clone () from /lib/libc.so.6

(gdb) up
#1  0x0806e911 in qemudNodeGetSecurityModel (conn=0xb4c01d18, 
    secmodel=0xb80bdf6a) at qemu_driver.c:3141
3141	    if (strlen(p) >= VIR_SECURITY_MODEL_BUFLEN-1) {
(gdb) print secmodel
$1 = (virSecurityModelPtr) 0xb80bdf6a
(gdb) print *secmodel
$2 = {model = '\0' <repeats 256 times>, doi = '\0' <repeats 256 times>}
(gdb) print p
$3 = 0x0
Comment 1 Daniel Veillard 2009-06-16 08:58:45 EDT
There is a couple missing check for sure,

    qemuDriverLock(driver);
    if (!driver->securityDriver) {
        ret = -2;
        goto cleanup;
    }

    p = driver->caps->host.secModel.model;
    if (strlen(p) >= VIR_SECURITY_MODEL_BUFLEN-1) {

we should also check for driver->caps and driver->caps->host.secModel.model
being NULL.
Comment 2 Daniel Berrange 2009-06-16 09:16:39 EDT
That does not entirely make sense. 

If 'securityDriver' is *not* NULL, then it should be impossible for driver->caps->host.secModel.model or 'doi' to be NULL. We explicitly check for a NULL 'model'/'doi' when activating the securityDriver at time libvirtd starts.

Likewise it is impossible for 'driver->caps' to be NULL, because the QEMU driver would have exited with error during startup if this had occurred.

qemudSecurityInit(struct qemud_driver *qemud_drv)

...snip....

    qemud_drv->securityDriver = security_drv;
    doi = virSecurityDriverGetDOI(security_drv);
    model = virSecurityDriverGetModel(security_drv);

    VIR_DEBUG("Initialized security driver \"%s\" with "
              "DOI \"%s\"", model, doi);

    /*
     * Add security policy host caps now that the security driver is
     * initialized.
     */
    caps = qemud_drv->caps;

    caps->host.secModel.model = strdup(model);
    if (!caps->host.secModel.model) {
        char ebuf[1024];
        VIR_ERROR(_("Failed to copy secModel model: %s"),
                  virStrerror(errno, ebuf, sizeof ebuf));
        return -1;
    }

    caps->host.secModel.doi = strdup(doi);
    if (!caps->host.secModel.doi) {
        char ebuf[1024];
        VIR_ERROR(_("Failed to copy secModel DOI: %s"),
                  virStrerror(errno, ebuf, sizeof ebuf));
        return -1;
    }
Comment 3 Daniel Veillard 2009-06-16 10:52:41 EDT
Well, I guess libvirtd was started at a point where *something* prevented the
security model from loading, probably because another service wasn't started, virSecurityDriverStartup returns -2 and you still have a NULL string there,
though the driver started...
If restarted after the full boot completed, then libvirtd security driver
initialize properly. Still I think the code should check for the NULL pointers
there.

Daniel
Comment 4 Daniel Berrange 2009-06-16 10:58:51 EDT
If the security driver  fails to startup and returns -2, then the 'driver->securityDriver' field will be NULL though, so the first check in qemudNodeGetSecurityModel should catch that
Comment 5 Stephen Smalley 2009-06-16 13:21:56 EDT
More information:  The segfault is reproducible even after manual restart of libvirtd after boot, just not every time.  So it isn't only linked to the initial startup of libvirtd.  I might try running libvirtd under valgrind if I get a chance, or someone else could try that, and then exercise virsh dominfo.
Comment 6 Stephen Smalley 2009-06-16 15:55:05 EDT
/sbin/service libvirtd stop
valgrind /usr/sbin/libvirtd
virt-manager
<Started vm0 running>
virsh dominfo vm0

yields the same traceback as the prior gdb traceback with an Invalid read of size 1 upon the strlen() call.  So no prior memory corruption detected by valgrind before the erroneous reference.
Comment 7 Daniel Berrange 2009-06-17 06:16:14 EDT
Can you run with fully verbose debugging enabled, eg

  LIBVIRT_DEBUG=1  /usr/sbin/libvirtd

and capture the output. This should let us see if anything is going wrong during driver initialization.
Comment 8 Stephen Smalley 2009-06-17 08:10:05 EDT
Created attachment 348243 [details]
output of LIBVIRT_DEBUG=1 /usr/sbin/libvirtd

This was the output from LIBVIRT_DEBUG=1 from the beginning up to a segfault upon virsh dominfo vm0.  First try at virsh dominfo vm0 was ok, but after starting virt-manager and trying again, it segfaulted.
Comment 9 Stephen Smalley 2009-06-17 08:23:05 EDT
Running it again without LIBVIRT_DEBUG, and attaching to it with gdb, and breaking in qemudNodeGetSecurityModel, the driver caps info is:

(gdb) print *((struct qemud_driver *)conn->privateData)->caps
$10 = {host = {arch = 0xb4b00468 "i686", nfeatures = 0, features = 0x0, 
    offlineMigrate = 0, liveMigrate = 0, nmigrateTrans = 0, 
    migrateTrans = 0x0, nnumaCell = 0, numaCell = 0x0, secModel = {
      model = 0x0, doi = 0x0}}, nguests = 6, guests = 0xb4b00ae0, 
  macPrefix = "RT"}
Comment 10 Daniel Berrange 2009-06-17 08:42:59 EDT
Arrggh @(*)(%#@%#@. I've discovered what's going wrong here.

Everytime the virGetCapabilities() method is run, it destroys the existing virCapsPtr object and creates it again....without any of the selinux info. virt-manager runs this method. so once virt-manager has run, subsequent calls to get the security info will give the crash you see.
Comment 12 Stephen Smalley 2009-06-25 15:11:57 EDT
Will this patch be back-ported to f11?
Comment 13 Daniel Veillard 2009-06-27 09:54:37 EDT
That applied upstream so should be in 0.6.5,

  I don't know if that will be applied to F11 which is still 0.6.2
maybe using the virt-preview repo will be better

http://fedorapeople.org/~markmc/virt-preview/f11/

I expect it to be updated with 0.6.5 soon after next week release.
Please reopen if you really want a backport instead,

  thanks 

Daniel
Comment 14 Mark McLoughlin 2009-07-03 05:00:29 EDT
okay, summary is if you do:

  $> virsh capabilities
  $> virsh dominfo foo

then libvirtd crashes

I think that's certainly worth backporting the fix to F-11 for
Comment 15 Mark McLoughlin 2009-07-03 06:09:33 EDT
Rawhide:

* Fri Jul  3 2009 Mark McLoughlin <markmc@redhat.com> - 0.6.4-4.fc12
- Fix libvirtd crash with bad capabilities data (bug #505635)

F-11:

* Fri Jul  3 2009 Mark McLoughlin <markmc@redhat.com> - 0.6.2-13.fc11
- Fix libvirtd crash with bad capabilities data (bug #505635)
Comment 16 Fedora Update System 2009-07-11 13:05:19 EDT
libvirt-0.6.2-13.fc11 has been pushed to the Fedora 11 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update libvirt'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-7449
Comment 17 Stephen Smalley 2009-07-27 12:56:59 EDT
libvirt-0.6.2-13.fc11 fixes the bug for me.
Comment 18 Fedora Update System 2009-07-27 17:27:40 EDT
libvirt-0.6.2-13.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.