Bug 623166 - libvirtd daemon core dumps not working
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.1
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Jiri Denemark
Moran Goldboim
Keywords: Reopened
Duplicates: 625334
Reported: 2010-08-11 10:01 EDT by Moran Goldboim
Modified: 2011-05-19 09:20 EDT
Fixed In Version: libvirt-0.8.7-2.el6
Doc Type: Bug Fix
Last Closed: 2011-05-19 09:20:01 EDT
External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0596 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2011-05-18 13:56:36 EDT

Description Moran Goldboim 2010-08-11 10:01:21 EDT
Description of problem:
[libvirt]please add an option to run libvirt/qemu with core dumps enabled

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
Comment 2 RHEL Product and Program Management 2010-08-11 10:18:04 EDT
This feature request has been proposed after Feature Freeze and we
are unable to resolve it in time for the current Red Hat Enterprise
Linux release. It has been denied for the current release and
proposed for the next Red Hat Enterprise Linux release.
Comment 3 Dan Kenigsberg 2010-08-11 10:47:05 EDT
Sorry, Moran and others, I can do it without libvirtd involvement.

I'll set
   DAEMON_COREFILE_LIMIT=unlimited

in /etc/sysconfig/libvirtd, and that's that.

It works fine for qemu-kvm exec'ed by libvirtd, but not for libvirtd itself: `pkill -SEGV libvirtd`  does not generate a core dump. Any clue why?
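
Note: the setting above amounts to raising the core file size limit before the daemon starts. A minimal Python sketch of the same idea, assuming the init script simply applies the value as a core ulimit (that is an assumption about the script, and the function name is made up for illustration). It only raises the soft limit as far as the hard limit, so it gives "unlimited" only where the hard limit already allows it:

```python
import resource


def raise_core_limit_to_hard():
    """Raise the soft core-size limit to the hard limit; return the new (soft, hard)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Raising the soft limit up to the hard limit needs no privileges.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = raise_core_limit_to_hard()
    label = "unlimited" if soft == resource.RLIM_INFINITY else str(soft)
    print("core file size limit (soft):", label)
```

Children started after this call inherit the limit, which matches the observation that qemu-kvm exec'ed by libvirtd does dump core.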
Comment 4 Daniel Berrange 2010-10-20 11:16:39 EDT
Removing feature tag, since this should already work and thus it's merely a bug if it doesn't
Comment 5 Dave Allan 2010-11-08 17:17:55 EST
*** Bug 625334 has been marked as a duplicate of this bug. ***
Comment 6 Jiri Denemark 2010-12-02 09:50:06 EST
In the bz which was closed as a dup some more details can be found:

# libvirtd & kill -SEGV $!
[1] 11144
[1]+  Segmentation fault      (core dumped) libvirtd

But if given enough time, libvirt no longer dumps the core:

# libvirtd & sleep 3; kill -SEGV $!
[1] 11567
[1]+  Segmentation fault      libvirtd


The first case just kills bash, since exec() didn't get a chance to be called. After a while the libvirtd process is executed, and it doesn't generate cores. However, /proc/PID/limits still says that the core file size is unlimited.

The reporters are also using /var/log/core/core.%p.%t.dump core patterns in /proc/sys/kernel/core_pattern to change the location where core dumps are stored.

The machine is RHEL-6.0, libvirt-0.8.1-28.el6.x86_64, kernel-2.6.32-72.el6.x86_64, SELinux was in permissive mode.

Still I couldn't reproduce it locally.
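
Note: the /proc/PID/limits check mentioned above can be scripted. A small Python sketch, with helper names invented for illustration, that pulls the "Max core file size" row out of the file. It only reports the resource limit; as the rest of this thread shows, an unlimited value here does not by itself guarantee a core dump:

```python
import os
import re


def parse_core_limit(limits_text):
    """Return (soft, hard) as strings from the 'Max core file size' row, or None."""
    for line in limits_text.splitlines():
        if line.startswith("Max core file size"):
            # Columns are separated by runs of two or more spaces.
            fields = re.split(r"\s{2,}", line.strip())
            return fields[1], fields[2]
    return None


def core_limit_of(pid="self"):
    """Read and parse /proc/<pid>/limits for the given process."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_core_limit(f.read())


if __name__ == "__main__":
    if os.path.exists("/proc/self/limits"):
        print(core_limit_of())
```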
Comment 7 Daniel Berrange 2010-12-02 10:41:37 EST
I can't reproduce this problem on any machine except for those in TLV showing the problem, which are all running VDSM. On any RHEL or Fedora machine of any vintage, libvirtd always generates core dumps as expected when DAEMON_COREFILE_LIMIT=unlimited is set in sysconfig, or after ulimit -c unlimited.

Request that the reporter tries reproducing this problem on a machine which has *never* had VDSM installed. Assuming it dumps core correctly, then install VDSM, reboot, and try to reproduce it again, to see if the problem now occurs.
Comment 8 Haim 2010-12-08 16:06:14 EST
Well, the results are not conclusive (I tried only the first part); maybe I am missing some configuration.

1) clean machine (RHEL6) - install libvirt 
2) set 'ulimit -c unlimited' 
3) service libvirtd start 
4) kill -SEGV `pgrep libvirt` 
5) result: no cores!
6) rm -rf /var/run/libvirtd.pid 
7) /usr/sbin/libvirtd --daemon & (run daemon from command line)
8) kill -SEGV `pgrep libvirt`
9) result: core dump to current dir

Daniel - where should i configure the 'DAEMON_COREFILE_LIMIT=unlimited' ? 
tried to put it under /etc/sysconfig/init but libvirt refused to go up. 

please elaborate.
Comment 9 Dan Kenigsberg 2010-12-08 16:28:34 EST
(In reply to comment #8)
> Daniel - where should i configure the 'DAEMON_COREFILE_LIMIT=unlimited' ? 
> tried to put it under /etc/sysconfig/init but libvirt refused to go up. 

odd. but please try putting it in
    /etc/sysconfig/libvirtd

Where are you looking for daemon cores? If /proc/sys/kernel/core_pattern is unset it should be in libvirt's cwd (which is /).
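
Note: where a core ends up depends on core_pattern and on the crashing process's cwd. A rough Python sketch of that lookup, under simplifying assumptions: only %p, %t and %% are expanded, other specifiers are left untouched, and kernel.core_uses_pid is not modelled. The function name is hypothetical:

```python
import os


def core_destination(core_pattern, cwd, pid, timestamp):
    """Classify a core_pattern value; for file templates expand %p, %t and %%."""
    pattern = core_pattern.strip()
    if pattern.startswith("|"):
        # The kernel pipes the dump to this helper instead of writing a file.
        return ("pipe", pattern[1:].split()[0])
    expanded = (
        pattern.replace("%%", "\0")
        .replace("%p", str(pid))
        .replace("%t", str(timestamp))
        .replace("\0", "%")
    )
    # A relative template is resolved against the crashing process's cwd.
    return ("file", os.path.normpath(os.path.join(cwd, expanded)))


if __name__ == "__main__":
    # Kernel default pattern "core", with the daemon's cwd of "/".
    print(core_destination("core", "/", 11144, 0))
```

With the reporters' /var/log/core/core.%p.%t.dump pattern, the sketch returns an absolute path under /var/log/core regardless of cwd.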
Comment 10 Dave Allan 2010-12-09 16:45:14 EST
Haim, did Cole's suggestions on IRC yesterday solve it for you?
Comment 12 Jiri Denemark 2010-12-13 04:45:48 EST
(In reply to comment #11)
> (In reply to comment #9)
> > Where are you looking for daemon cores? If /proc/sys/kernel/core_pattern is
> > unset it should be in libvirt's cwd (which is /).
> 
> Dan - tried to add it to /etc/sysconfig/libvirtd with no luck. 
> no cores at '/'. 

What about the directory where you started libvirtd from? No cores even there?
Comment 17 Alan Pevec 2011-01-06 18:11:10 EST
libvirtd dumps core when the following line is commented out in the vdsm-customized libvirtd.conf:

#unix_sock_group="kvm" # by vdsm

I have no idea why, but I hope libvirt developers will have a clue now.
Comment 18 Alan Pevec 2011-01-06 18:23:53 EST
Just to confirm, setting unix_sock_group to any random group e.g. on my laptop:
unix_sock_group = "apevec"

prevents libvirtd from producing core dumps on sigsegv.
Comment 19 Jiri Denemark 2011-01-07 03:33:32 EST
WTH, you are right, I reproduced it even on my systems. Thanks a lot Alan for chasing that down.
Comment 20 Jiri Denemark 2011-01-07 05:12:25 EST
As a temporary workaround, you can do

# sysctl fs.suid_dumpable=2

or

# sysctl fs.suid_dumpable=1

The first option is more secure, but it works only if a custom core_pattern that avoids overwriting existing files is set, because with fs.suid_dumpable == 2 the kernel does not overwrite existing files. The second option is dangerous.

I'm working on a proper fix in the meantime...
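
Note: to see which mode a machine is currently in, the sysctl can be read back from /proc. A minimal Python sketch; the descriptions are short paraphrases of the kernel documentation, and the helper names are made up for illustration:

```python
SUID_DUMPABLE_MODES = {
    0: "default: a process that changed credentials does not dump core",
    1: "debug: every process may dump core (insecure)",
    2: "suidsafe: such processes dump core, readable by root only",
}


def describe_suid_dumpable(raw):
    """Turn the raw sysctl text into (value, description)."""
    value = int(raw.strip())
    return value, SUID_DUMPABLE_MODES.get(value, "unknown value")


def current_suid_dumpable(path="/proc/sys/fs/suid_dumpable"):
    """Read the live setting; requires a Linux /proc."""
    with open(path) as f:
        return describe_suid_dumpable(f.read())


if __name__ == "__main__":
    try:
        print(current_suid_dumpable())
    except OSError as exc:
        print("could not read fs.suid_dumpable:", exc)
```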
Comment 21 Daniel Berrange 2011-01-07 07:13:09 EST
libvirtd does

  old = getgid()
  setgid(unix_sock_gid);
  ...create unix sock...
  setgid(old);


So we should *not* in fact be considered a setuid/setgid process. The kernel, however, does track the gid changes. It simply looks for any change in GID/UID, and thereafter refuses to dump, even if you change back to your original UID/GID:

        /* dumpability changes */
        if (old->euid != new->euid ||
            old->egid != new->egid ||
            old->fsuid != new->fsuid ||
            old->fsgid != new->fsgid ||
            !cap_issubset(new->cap_permitted, old->cap_permitted)) {
                if (task->mm)
                        set_dumpable(task->mm, suid_dumpable);
                task->pdeath_signal = 0;
                smp_wmb();
        }


So I'd argue this is a kernel bug, but I doubt we'll have any luck getting that behaviour changed. The only option I can think of is to change the group of the socket itself, with fchown() on the socket FD, instead of calling setgid() before and after the socket calls.
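
Note: the per-process flag that set_dumpable() changes can be read from userspace with prctl(PR_GET_DUMPABLE). A Python sketch of the setgid round trip described above, calling prctl through ctypes. The function names are hypothetical, and switching to a group other than your own needs root (CAP_SETGID), so the demo falls back to a message when that is not permitted:

```python
import ctypes
import os

PR_GET_DUMPABLE = 3  # value from <linux/prctl.h>

_libc = ctypes.CDLL(None, use_errno=True)


def get_dumpable():
    """Return the calling process's dumpable flag (normally 1; 0 means no cores)."""
    result = _libc.prctl(PR_GET_DUMPABLE, 0, 0, 0, 0)
    if result < 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_GET_DUMPABLE) failed")
    return result


def dumpable_after_gid_roundtrip(gid):
    """Switch to gid and back, as in the pseudocode above; return (before, after)."""
    before = get_dumpable()
    old = os.getgid()
    os.setgid(gid)
    os.setgid(old)
    return before, get_dumpable()


if __name__ == "__main__":
    # Illustrative target: the first supplementary group that differs from our gid.
    others = [g for g in os.getgroups() if g != os.getgid()]
    try:
        print(dumpable_after_gid_roundtrip(others[0] if others else os.getgid()))
    except PermissionError:
        print("need root (CAP_SETGID) to switch to another group")
```

Run as root against a different group, the second value should show whatever fs.suid_dumpable is set to, even though the gid is back to the original.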
Comment 22 Jiri Denemark 2011-01-10 05:42:36 EST
This is fixed upstream by v0.8.7-19-g5e5acbc:

commit 5e5acbc8d67e1ac074320176bbc3682b9ba934c0
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Fri Jan 7 12:34:12 2011 +0100

    daemon: Fix core dumps if unix_sock_group is set
    
    Setting unix_sock_group to something else than default "root" in
    /etc/libvirt/libvirtd.conf prevents system libvirtd from dumping core on
    crash. This is because we used setgid(unix_sock_group) before binding to
    /var/run/libvirt/libvirt-sock* and setgid() back to original group.
    However, if a process changes its effective or filesystem group ID, it
    will be forbidden from leaving core dumps unless fs.suid_dumpable sysctl
    is set to something else then 0 (and it is 0 by default).
    
    Changing socket's group ownership after bind works better. And we can do
    so without introducing a race condition since we loosen access rights by
    changing the group from root to something else.
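
Note: the ordering described in the commit message (bind first, then change the group of the socket file) can be sketched in Python as follows. This is not the libvirt code, which is C; the function name and the demo path are made up for illustration. The point is that the process never calls setgid(), so its credentials, and with them its dumpable flag, are left alone:

```python
import os
import socket
import tempfile


def bind_unix_socket_with_group(path, gid):
    """Bind a UNIX socket at path, then give its file to group gid via chown()."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    # -1 leaves the owner unchanged; only the group of the socket file changes.
    os.chown(path, -1, gid)
    return sock


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo-sock")
        sock = bind_unix_socket_with_group(path, os.getegid())
        print(path, "group:", os.stat(path).st_gid)
        sock.close()
```

The demo uses the caller's own group so that it runs without privileges; a daemon running as root could pass any gid.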
Comment 23 Jiri Denemark 2011-01-13 16:36:43 EST
Patch sent to rhvirt-patches:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-January/msg00657.html
Comment 25 zhanghaiyan 2011-01-18 02:08:06 EST
Verified that this bug passes with libvirt-0.8.7-2.el6.x86_64
- libvirt-0.8.7-2.el6.x86_64
- qemu-kvm-0.12.1.2-2.129.el6.x86_64
- 2.6.32-94.el6.x86_64

1. # echo DAEMON_COREFILE_LIMIT=unlimited >>  /etc/sysconfig/libvirtd
2. Keep the default unix_sock_group setting in /etc/libvirt/libvirtd.conf:
# This is restricted to 'root' by default.
#unix_sock_group = "libvirt"
3. # cat /proc/sys/kernel/core_pattern 
|/usr/libexec/abrt-hook-ccpp /var/spool/abrt %p %s %u %c
4. # service libvirtd restart
5. # pkill -SEGV libvirtd
6. # ls /core*
/core.22317
7. Change unix sock group to 'kvm'
# This is restricted to 'root' by default.
#unix_sock_group = "libvirt"
unix_sock_group = "kvm"
8. # service libvirtd restart
9. # pkill -SEGV libvirtd
10. # ls /core*
/core.22317  /core.22814

Also reproduced this bug with libvirt-0.8.7-1.el6.x86_64:
at step 9, no core dump file is produced.
Comment 28 errata-xmlrpc 2011-05-19 09:20:01 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0596.html
