Red Hat Bugzilla – Bug 964359
virSetUIDGID can deadlock due to unsafe use of getpwuid_r
Last modified: 2013-11-21 04:01:16 EST
Cloning to RHEL.

+++ This bug was initially created as a clone of Bug #964358 +++

Description of problem:
I hit an (admittedly rare) deadlock today where a multithreaded parent spawned off a child, then the child called getpwuid_r. POSIX says that getpwuid_r is NOT async-signal-safe, and therefore unsafe to use between fork and exec, precisely because of the possibility of the deadlock I hit. Libvirt needs to rewrite virSetUIDGID to take the information learned by getpwuid_r as a parameter gathered in advance, prior to the fork, rather than learning it inline after the fork.

Version-Release number of selected component (if applicable):
The deadlock hit me on latest libvirt.git, but git blame says that virSetUIDGID has been calling getpwuid_r since at least 2011, so I'm raising it against Fedora 18.

How reproducible:
Extremely rare

Steps to Reproduce:
1. Start a VM, and happen to hit the small window where the parent process holds the getpwuid_r mutex in one thread while another thread tries to fork and exec. While I happened to get (extremely) (un-)lucky and hit it in real life, it will probably be a lot easier to reproduce with gdb's help.

Actual results:
Deadlock. Parent process hung with the following backtrace:

(gdb) t a a bt

Thread 2 (Thread 0x7fd56bbf2700 (LWP 19367)):
#0  0x00007fd5761ec54d in read () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007fd578aace46 in saferead (fd=20, buf=0x7fd56bbf1117, count=1) at util/virfile.c:926
#2  0x00007fd578a9e97a in virCommandHandshakeWait (cmd=0x7fd55c231f10) at util/vircommand.c:2463
#3  0x00007fd56326ff40 in qemuProcessStart (conn=0x7fd53c000df0, driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1, stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1) at qemu/qemu_process.c:3729
#4  0x00007fd5632af536 in qemuDomainObjStart (conn=0x7fd53c000df0, driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, flags=0) at qemu/qemu_driver.c:5453
#5  0x00007fd5632af796 in qemuDomainCreateWithFlags (dom=0x7fd55c307d30, flags=0) at qemu/qemu_driver.c:5502
#6  0x00007fd5632af829 in qemuDomainCreate (dom=0x7fd55c307d30) at qemu/qemu_driver.c:5520
#7  0x00007fd578b91569 in virDomainCreate (domain=0x7fd55c307d30) at libvirt.c:8462
#8  0x00007fd5795ee7aa in remoteDispatchDomainCreate (server=0x7fd579d95280, client=0x7fd579da5010, msg=0x7fd579da87c0, rerr=0x7fd56bbf1aa0, args=0x7fd55c307d70) at remote_dispatch.h:2910
#9  0x00007fd5795ee6a0 in remoteDispatchDomainCreateHelper (server=0x7fd579d95280, client=0x7fd579da5010, msg=0x7fd579da87c0, rerr=0x7fd56bbf1aa0, args=0x7fd55c307d70, ret=0x7fd55c305430) at remote_dispatch.h:2888
#10 0x00007fd578bfdbfa in virNetServerProgramDispatchCall (prog=0x7fd579d91cf0, server=0x7fd579d95280, client=0x7fd579da5010, msg=0x7fd579da87c0) at rpc/virnetserverprogram.c:439
#11 0x00007fd578bfd771 in virNetServerProgramDispatch (prog=0x7fd579d91cf0, server=0x7fd579d95280, client=0x7fd579da5010, msg=0x7fd579da87c0) at rpc/virnetserverprogram.c:305
#12 0x00007fd578c03ff5 in virNetServerProcessMsg (srv=0x7fd579d95280, client=0x7fd579da5010, prog=0x7fd579d91cf0, msg=0x7fd579da87c0) at rpc/virnetserver.c:162
#13 0x00007fd578c040e4 in virNetServerHandleJob (jobOpaque=0x7fd579da91d0, opaque=0x7fd579d95280) at rpc/virnetserver.c:183
#14 0x00007fd578ae3ab6 in virThreadPoolWorker (opaque=0x7fd579d731a0) at util/virthreadpool.c:144
#15 0x00007fd578ae347b in virThreadHelper (data=0x7fd579d73110) at util/virthreadpthread.c:161
#16 0x00007fd5761e5851 in start_thread (arg=0x7fd56bbf2700) at pthread_create.c:301
#17 0x00007fd575b2c90d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7fd579597800 (LWP 19341)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007fd578ae32a1 in virCondWait (c=0x7fd579d95490, m=0x7fd579d95438) at util/virthreadpthread.c:117
#2  0x00007fd578ae3f84 in virThreadPoolFree (pool=0x7fd579d95400) at util/virthreadpool.c:264
#3  0x00007fd578c06b93 in virNetServerDispose (obj=0x7fd579d95280) at rpc/virnetserver.c:1168
#4  0x00007fd578acecb4 in virObjectUnref (anyobj=0x7fd579d95280) at util/virobject.c:264
#5  0x00007fd5795e5031 in main (argc=1, argv=0x7fff3eaec128) at libvirtd.c:1500

Child process hung with the following backtrace:

(gdb) t a a bt

Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360) at pthread_mutex_lock.c:61
#3  0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70, buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8) at nss_files/files-pwd.c:40
#4  0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70, buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0) at ../nss/getXXbyYY_r.c:253
#5  0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031
#6  0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0, clearExistingCaps=true) at util/virutil.c:1388
#7  0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654
#8  0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0) at util/vircommand.c:2247
#9  0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0) at util/vircommand.c:2100
#10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0, driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1, stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1) at qemu/qemu_process.c:3694
#11 0x00007fd5632af536 in qemuDomainObjStart (conn=0x7fd53c000df0, driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, flags=0) at qemu/qemu_driver.c:5453
#12 0x00007fd5632af796 in qemuDomainCreateWithFlags (dom=0x7fd55c307d30, flags=0) at qemu/qemu_driver.c:5502
#13 0x00007fd5632af829 in qemuDomainCreate (dom=0x7fd55c307d30) at qemu/qemu_driver.c:5520
#14 0x00007fd578b91569 in virDomainCreate (domain=0x7fd55c307d30) at libvirt.c:8462
#15 0x00007fd5795ee7aa in remoteDispatchDomainCreate (server=0x7fd579d95280, client=0x7fd579da5010, msg=0x7fd579da87c0, rerr=0x7fd56bbf1aa0, args=0x7fd55c307d70) at remote_dispatch.h:2910
#16 0x00007fd5795ee6a0 in remoteDispatchDomainCreateHelper (server=0x7fd579d95280, client=0x7fd579da5010, msg=0x7fd579da87c0, rerr=0x7fd56bbf1aa0, args=0x7fd55c307d70, ret=0x7fd55c305430) at remote_dispatch.h:2888
#17 0x00007fd578bfdbfa in virNetServerProgramDispatchCall (prog=0x7fd579d91cf0, server=0x7fd579d95280, client=0x7fd579da5010, msg=0x7fd579da87c0) at rpc/virnetserverprogram.c:439
#18 0x00007fd578bfd771 in virNetServerProgramDispatch (prog=0x7fd579d91cf0, server=0x7fd579d95280, client=0x7fd579da5010, msg=0x7fd579da87c0) at rpc/virnetserverprogram.c:305
#19 0x00007fd578c03ff5 in virNetServerProcessMsg (srv=0x7fd579d95280, client=0x7fd579da5010, prog=0x7fd579d91cf0, msg=0x7fd579da87c0) at rpc/virnetserver.c:162
#20 0x00007fd578c040e4 in virNetServerHandleJob (jobOpaque=0x7fd579da91d0, opaque=0x7fd579d95280) at rpc/virnetserver.c:183
#21 0x00007fd578ae3ab6 in virThreadPoolWorker (opaque=0x7fd579d731a0) at util/virthreadpool.c:144
#22 0x00007fd578ae347b in virThreadHelper (data=0x7fd579d73110) at util/virthreadpthread.c:161
#23 0x00007fd5761e5851 in start_thread (arg=0x7fd56bbf2700) at pthread_create.c:301
#24 0x00007fd575b2c90d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Expected results:
No deadlock - getpwuid_r must be called BEFORE fork, and the results cached and passed as parameters to virSetUIDGID after the fork, so that the child isn't using non-async-signal-safe functions that can deadlock it and, in turn, hang the parent.

Additional info:
In searching for duplicates, I found that bug 452629 described similar symptoms, but without a root-cause analysis tracing it to the POSIX async-signal-safety aspect.
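[Editorial note: for illustration only, here is a minimal standalone sketch of the pattern described in the expected results. It is NOT libvirt's actual code; the run_as() helper, buffer sizes, and group array size are made up. The point is that getpwuid_r()/getgrouplist() run in the parent, and the child only issues async-signal-safe calls.]

/* fork_safe_sketch.c - illustrative only, not libvirt code. */
#define _DEFAULT_SOURCE
#include <grp.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: run argv[] as the given uid with its full
 * supplementary group list. */
static pid_t run_as(uid_t uid, char *const argv[])
{
    struct passwd pwd, *res;
    char buf[4096];
    gid_t groups[64];
    int ngroups = 64;
    pid_t child;

    /* Parent side: these lookups may take glibc/NSS locks, so they
     * must happen before fork(). */
    if (getpwuid_r(uid, &pwd, buf, sizeof(buf), &res) != 0 || !res)
        return -1;
    if (getgrouplist(pwd.pw_name, pwd.pw_gid, groups, &ngroups) < 0)
        return -1;

    child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child side: only async-signal-safe calls from here on. */
        if (setgroups((size_t)ngroups, groups) < 0 ||
            setgid(pwd.pw_gid) < 0 ||
            setuid(uid) < 0)
            _exit(127);
        execv(argv[0], argv);
        _exit(127);
    }
    return child;
}

int main(int argc, char **argv)
{
    char *cmd[] = { "/bin/true", NULL };
    uid_t uid = argc > 1 ? (uid_t)atoi(argv[1]) : getuid();
    return run_as(uid, cmd) < 0;
}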
getpwuid_r() has been called by libvirt between fork() and exec() since commit b7bd75c, which was in v0.8.5. (The code originally lived in a different location, but was later moved into the general-purpose virSetUIDGID() function and called from even more places; I haven't looked closely, but this code movement could make backporting a fix to older branches a bit tedious.)
I hit this again today while testing a fix for bug 903433; I'm surprised that no one else seems to have hit this issue.
Testing notes: in /etc/libvirt/qemu.conf, set 'user' and 'group' to qemu (not root), and restart libvirtd. As a sanity check, inspect /proc/nnn/status of a qemu process started by this libvirt; assuming a default installation, it should show:

Uid:    107     107     107     107
Gid:    107     107     107     107
...
Groups: 36 107

both pre- and post-patch (that is, this validates that we are correctly assigning supplementary group 36 (kvm) to the qemu process).

I have not yet played with root-squash NFS in the context of this bug, but suspect it should be possible to write a test case where a file that is owned by the kvm group but NOT by the qemu uid or gid is still usable by a guest (i.e., using root-squash NFS as proof that the cases where we have to fork a child to work around root-squash limitations still add the right supplementary groups to mimic what the real qemu user can access).

So far, I haven't found a RELIABLE way to trigger the deadlock pre-patch; I'm still messing around with running libvirtd under gdb and trying to figure out where to put a breakpoint in glibc's getpwuid_r, followed by the right steps to make one thread of libvirtd do a uid lookup; the idea is that once the breakpoint is hit, I then let another thread run that forks and does a uid lookup in the child, to reliably prove that the pre-patch deadlock exists. If I come up with a reproducible formula (rather than just starting and stopping multiple domains in parallel while running a hook script in /etc/libvirt/hooks/qemu), I'll add it as another note.
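[Editorial note: to make the Groups: comparison above less error-prone, a throwaway helper like the sketch below can print the supplementary group list a correctly started qemu process is expected to carry. It is not part of libvirt or of the test plan above; the default user name "qemu" and the array size are assumptions.]

/* expected_groups.c - print the supplementary groups for a user so the
 * output can be compared with the Groups: line in /proc/<pid>/status. */
#define _DEFAULT_SOURCE
#include <grp.h>
#include <pwd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const char *user = argc > 1 ? argv[1] : "qemu";
    struct passwd *pw = getpwnam(user);
    gid_t groups[64];
    int i, ngroups = 64;

    if (!pw || getgrouplist(user, pw->pw_gid, groups, &ngroups) < 0) {
        fprintf(stderr, "lookup failed for %s\n", user);
        return 1;
    }
    printf("expected Groups for %s:", user);
    for (i = 0; i < ngroups; i++)
        printf(" %u", (unsigned)groups[i]);
    printf("\n");
    return 0;
}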
v1 of a potential patch has been posted upstream; it will need a v2 depending on what gnulib gives us, but I plan on doing a scratch build soon. https://www.redhat.com/archives/libvir-list/2013-May/msg01600.html
The fix is now in place upstream with this series of commits: bfc183c 29fe5d7 fdb3bde ee777e9 75c1256 c1983ba, plus an optional gnulib submodule update (I'll have to figure out how to avoid backporting that, as it messes with the build process).
In POST: http://post-office.corp.redhat.com/archives/rhvirt-patches/2013-July/msg00654.html
An additional issue was identified upstream that affects Debian's choice of using a non-primary group for running qemu, while leaving the files only owned by the primary group: https://www.redhat.com/archives/libvir-list/2013-August/msg00209.html But since RHEL doesn't use Debian's uid setup, it should not matter if that patch is not backported to RHEL.
Hi Eric, I'm trying to verify this bug with the latest libvirt, but I'm not clear about how to trigger it. Could you give some suggestions so I can verify it? Thanks!
Positive verification is easy: both pre-patch and post-patch, run with /etc/libvirt/qemu.conf set to a non-root user that belongs to more than one group, and ensure that the qemu child processes are associated with all of the supplementary groups of that user rather than just the primary group.

Negative verification (triggering the pre-patch race) is much harder. I happened to hit the race while running a hook script associated with domain startup, where all the hook script did was sleep 2 seconds (http://libvirt.org/hooks.html), and then attempting to start multiple domains in parallel. That was enough to have multiple libvirtd threads probing the group database while other threads were trying to fork; but even then, I only saw the deadlock twice across hundreds of attempts. I think you will only get reliable tests if you run libvirtd under gdb and put a breakpoint inside the glibc mutex used within getpwuid_r, although I have not attempted to figure out how to set up such a test. I'm also trying to work on an LD_PRELOAD addition to libvirt/tests that could serve a similar purpose, but don't have that finished yet.
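[Editorial note: a rough illustration of the kind of LD_PRELOAD interposer mentioned above, not the libvirt/tests work in progress. It does not force the deadlock; it merely warns whenever getpwuid_r runs in a child of the original process, i.e. in the fork/exec window this bug is about. File name and message text are made up.]

/* pwshim.c - illustrative LD_PRELOAD interposer, not libvirt test code.
 * Build:  gcc -shared -fPIC -o pwshim.so pwshim.c -ldl
 * Use:    LD_PRELOAD=/path/to/pwshim.so libvirtd ... */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <pwd.h>
#include <unistd.h>

typedef int (*getpwuid_r_fn)(uid_t, struct passwd *, char *, size_t,
                             struct passwd **);

static pid_t start_pid;
static getpwuid_r_fn real_getpwuid_r;

__attribute__((constructor))
static void shim_init(void)
{
    /* Re-run by the dynamic loader after exec, so post-exec processes
     * get their own start_pid and do not trigger false warnings. */
    start_pid = getpid();
    real_getpwuid_r = (getpwuid_r_fn)dlsym(RTLD_NEXT, "getpwuid_r");
}

int getpwuid_r(uid_t uid, struct passwd *pwd, char *buf, size_t buflen,
               struct passwd **result)
{
    if (getpid() != start_pid) {
        /* We are in a forked child before exec; warn using write(),
         * which is async-signal-safe, unlike stdio. */
        static const char msg[] = "pwshim: getpwuid_r called after fork!\n";
        if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
            /* nothing safe to do about a failed diagnostic write */
        }
    }
    return real_getpwuid_r(uid, pwd, buf, buflen, result);
}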
Hi, Eric

Testing with libvirt-0.10.2-22.el6.x86_64, the qemu processes' Groups value is always 0; this may be a problem. Details follow.

Positive test with libvirt-0.10.2-21.el6.x86_64:
1. Set user/group to "qemu", restart libvirtd, then start a guest.
2. Check the qemu child processes:

# ps -eaL|grep qemu
10695 10695 ?        00:00:00 qemu-kvm
10695 10697 ?        00:00:00 qemu-kvm
10695 10698 ?        00:00:00 qemu-kvm

# cat /proc/10698/status
...
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 36 107

# cat /proc/10697/status
...
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 36 107

# cat /proc/10695/status
...
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 36 107

------------
Then update libvirt to libvirt-0.10.2-22.el6.x86_64 and repeat steps 1 & 2:

# ps -eaL|grep qemu
11006 11006 ?        00:00:00 qemu-kvm
11006 11008 ?        00:00:00 qemu-kvm
11006 11009 ?        00:00:00 qemu-kvm

# cat /proc/11009/status
...
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 0

# cat /proc/11008/status
...
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 0

# cat /proc/11006/status
...
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 0
(In reply to yanbing du from comment #14)
> Hi, Eric
> Test with libvirt-0.10.2-22.el6.x86_64, the qemu processes groups always 0,
> this maybe a problem, details as following:

Yes, it IS a problem. Reopening this bug.

> ------------
> Then update libvirt to libvirt-0.10.2-22.el6.x86_64, repeat step1&2:
> # ps -eaL|grep qemu
> 11006 11006 ?        00:00:00 qemu-kvm
> 11006 11008 ?        00:00:00 qemu-kvm
> 11006 11009 ?        00:00:00 qemu-kvm
>
> # cat /proc/11009/status
> ...
> Uid:    107     107     107     107
> Gid:    107     107     107     107
> Utrace: 0
> FDSize: 64
> Groups: 0

That should not be happening. I'm investigating...
*** Bug 997350 has been marked as a duplicate of this bug. ***
Now upstream as of commit 745aa55fb, and v2 is in POST on rhvirt-archives (would post the URL, but rhvirt-archives appears to be stuck since the 27th...)
(In reply to Eric Blake from comment #20)
> Now upstream as of commit 745aa55fb, and v2 is in POST on rhvirt-archives
> (would post the URL, but rhvirt-archives appears to be stuck since the
> 27th...)

Here's the message id to look for, if the archives ever get unstuck:
<1377804152-29348-1-git-send-email-eblake@redhat.com>
*** Bug 1002115 has been marked as a duplicate of this bug. ***
*** Bug 1003488 has been marked as a duplicate of this bug. ***
Hi, Eric

I just tried to verify this bug with libvirt-0.10.2-24.el6.x86_64. The following are my verification steps; please help check whether they are enough to verify this bug. Thanks.

Reproduce the bug with libvirt-0.10.2-22.el6.x86_64:
1. Create a new group "abc" and add the user "qemu" to it:
# id qemu
uid=107(qemu) gid=107(qemu) groups=107(qemu),36(kvm),500(abc)
2. Set user/group to "qemu" in qemu.conf, restart libvirtd, then start a guest.
3. Check the qemu child processes:

# ps -eaL|grep qemu
29517 29517 ?        00:00:03 qemu-kvm
29517 29543 ?        00:00:40 qemu-kvm
29517 29544 ?        00:00:00 qemu-kvm
29517 30038 ?        00:00:00 qemu-kvm

# cat /proc/29544/status
Name:   qemu-kvm
State:  S (sleeping)
Tgid:   29517
Pid:    29544
PPid:   1
TracerPid:      0
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 0 492

Then update libvirt to libvirt-0.10.2-24.el6.x86_64 and repeat steps 2 & 3; the qemu child processes are now associated with all of the supplementary groups of that user:

# ps -eaL|grep qemu
30736 30736 ?        00:00:00 qemu-kvm
30736 30758 ?        00:00:00 qemu-kvm
30736 30759 ?        00:00:02 qemu-kvm
30736 30760 ?        00:00:00 qemu-kvm

# cat /proc/30760/status
Name:   qemu-kvm
State:  S (sleeping)
Tgid:   30736
Pid:    30760
PPid:   1
TracerPid:      0
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 36 107 500

# cat /proc/30759/status
Name:   qemu-kvm
State:  S (sleeping)
Tgid:   30736
Pid:    30759
PPid:   1
TracerPid:      0
Uid:    107     107     107     107
Gid:    107     107     107     107
Utrace: 0
FDSize: 64
Groups: 36 107 500
*** Bug 1005781 has been marked as a duplicate of this bug. ***
*** Bug 999230 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1581.html