Bug 1208176
| Summary: | Race starting multiple libvirtd user sessions at the same time | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Richard W.M. Jones <rjones> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.1 | CC: | agedosier, berrange, clalancette, dyuan, extras-qa, itamar, jforbes, jsuchane, kchamart, laine, libvirt-maint, mkletzan, mzhan, rbalakri, rjones, shyu, veillard, virt-maint, yafu, zhwang |
| Target Milestone: | rc | Keywords: | Upstream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-1.2.15-1.el7 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1200149 | Environment: | |
| Last Closed: | 2015-11-19 06:26:50 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1200149 | | |
| Bug Blocks: | 910269, 1194593 | | |
Description
Richard W.M. Jones
2015-04-01 14:53:16 UTC
I've just pushed the patch upstream:
commit be78814ae07f092d9c4e71fd82dd1947aba2f029
Author: Michal Privoznik <mprivozn>
AuthorDate: Thu Apr 2 14:41:17 2015 +0200
Commit: Michal Privoznik <mprivozn>
CommitDate: Wed Apr 15 13:39:13 2015 +0200
virNetSocketNewConnectUNIX: Use flocks when spawning a daemon
https://bugzilla.redhat.com/show_bug.cgi?id=1200149
Even though we have a mutex mechanism so that two clients don't spawn
two daemons, it's not strong enough. It can happen that while one
client is spawning the daemon, the other one fails to connect.
Basically, one of two errors can happen:
error: Failed to connect socket to '/home/mprivozn/.cache/libvirt/libvirt-sock': Connection refused
or:
error: Failed to connect socket to '/home/mprivozn/.cache/libvirt/libvirt-sock': No such file or directory
The problem in both cases is that the daemon is still starting up while
we are trying to connect (and fail). We should postpone the connecting
phase until the daemon has been started (by the other thread that is
spawning it). In order to do that, create a file lock 'libvirt-lock'
in the directory where the session daemon would create its socket.
Even when called from multiple processes, spawning a daemon then
serializes on the file lock, so only the first process to acquire it
actually spawns the daemon.
Tested-by: Richard W. M. Jones <rjones>
Signed-off-by: Michal Privoznik <mprivozn>
v1.2.14-174-gbe78814
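
To make the mechanism concrete, here is a minimal standalone C sketch of the locking pattern the commit message describes. This is not the libvirt source: `try_connect`, `connect_or_spawn`, and the `spawn_daemon` callback are hypothetical names, and the real code in `virNetSocketNewConnectUNIX` additionally handles retries, timeouts, and error reporting. The sketch assumes `spawn_daemon` returns 0 once the daemon's socket is accepting connections.

```c
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Try a single connect() to the UNIX socket; returns a connected fd,
 * or -1 if the daemon is not (yet) accepting connections. */
static int
try_connect(const char *sockpath)
{
    struct sockaddr_un addr;
    int fd;

    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, sockpath, sizeof(addr.sun_path) - 1);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);   /* ECONNREFUSED or ENOENT: daemon not up yet */
        return -1;
    }
    return fd;
}

/* Connect to the session daemon's socket, spawning the daemon first if
 * necessary.  All would-be spawners serialize on an exclusive lock on
 * 'lockpath', so only the first process actually spawns the daemon;
 * the rest block in flock() until it is up. */
int
connect_or_spawn(const char *sockpath, const char *lockpath,
                 int (*spawn_daemon)(void))
{
    int fd, lockfd;

    /* Fast path: the daemon is already running. */
    if ((fd = try_connect(sockpath)) >= 0)
        return fd;

    if ((lockfd = open(lockpath, O_RDWR | O_CREAT, 0600)) < 0)
        return -1;
    if (flock(lockfd, LOCK_EX) < 0) {  /* blocks while another process spawns */
        close(lockfd);
        return -1;
    }

    /* Re-check under the lock: the previous lock holder may already
     * have spawned the daemon while we were waiting. */
    if ((fd = try_connect(sockpath)) < 0 && spawn_daemon() == 0)
        fd = try_connect(sockpath);

    flock(lockfd, LOCK_UN);
    close(lockfd);
    return fd;
}
```

A useful property of this design is that `flock()` locks are released automatically when the holding process exits, so a client that crashes while spawning the daemon cannot leave the others deadlocked.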
I can reproduce this bug with build libvirt-1.2.8-16.el7.x86_64.

1. Log in as a NON-root user:

```
# su - test1
Last login: Thu May 28 16:02:04 CST 2015 on pts/1
$ virsh list
 Id    Name                           State
----------------------------------------------------

$ ps aux | grep libvirtd
test1 16546 1.5 0.1 645588 14076 ? Sl 10:50 0:00 /usr/sbin/libvirtd --timeout=30
test1 16575 0.0 0.0 112640 960 pts/1 S+ 10:50 0:00 grep --color=auto libvirtd
root 27111 0.0 0.2 1024300 18636 ? Ssl Jun23 0:00 /usr/sbin/libvirtd
```

2. Run the command below; one of the virsh clients exits with status 1:

```
$ killall libvirtd ; for i in `seq 1 5`; do virsh list >/tmp/log$i 2>&1 & done;
libvirtd(27111): Operation not permitted
libvirtd: no process found
[1] 16619
[2] 16620
[3] 16621
[4] 16622
[5] 16623
$ virsh list
 Id    Name                           State
----------------------------------------------------

[1]   Done                    virsh list > /tmp/log$i 2>&1
[2]   Done                    virsh list > /tmp/log$i 2>&1
[3]   Done                    virsh list > /tmp/log$i 2>&1
[4]-  Done                    virsh list > /tmp/log$i 2>&1
[5]+  Exit 1                  virsh list > /tmp/log$i 2>&1
```

3. Check the logs; the losing client failed to connect:

```
$ grep Fail /tmp/log*
/tmp/log5:error: Failed to connect socket to '/home/test1/.cache/libvirt/libvirt-sock': Connection refused
$ cat /tmp/log?
 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/home/test1/.cache/libvirt/libvirt-sock': Connection refused
```

Verified this bug with build libvirt-1.2.16-1.el7.x86_64.

1. Log in as a NON-root user:

```
# su - test1
Last login: Fri Jun 26 10:40:18 HKT 2015 on pts/11
$ ps aux | grep libvirtd
test1 19200 0.0 0.0 112640 964 pts/11 S+ 10:41 0:00 grep --color=auto libvirtd
root 31153 0.0 0.0 155440 3780 pts/3 S+ Jun17 0:00 vim /etc/libvirt/libvirtd.conf
$ virsh list
 Id    Name                           State
----------------------------------------------------

$ ps aux | grep libvirtd
test1 19204 5.0 0.2 811588 17896 ? Sl 10:41 0:00 /usr/sbin/libvirtd --timeout=30
test1 19243 0.0 0.0 112640 964 pts/11 S+ 10:41 0:00 grep --color=auto libvirtd
root 31153 0.0 0.0 155440 3780 pts/3 S+ Jun17 0:00 vim /etc/libvirt/libvirtd.conf
```

2. Run the same command; all five clients now complete successfully:

```
$ killall libvirtd ; for i in `seq 1 5`; do virsh list >/tmp/log$i 2>&1 & done;
[1] 19246
[2] 19247
[3] 19248
[4] 19249
[5] 19250
[1]   Done                    virsh list > /tmp/log$i 2>&1
[2]   Done                    virsh list > /tmp/log$i 2>&1
[3]   Done                    virsh list > /tmp/log$i 2>&1
[4]-  Done                    virsh list > /tmp/log$i 2>&1
[5]+  Done                    virsh list > /tmp/log$i 2>&1
```

3. Check the logs; no failures are logged:

```
$ grep Fail /tmp/log*
$ cat /tmp/log?
 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------
```

Reproduced the steps several times and always got the same result; moving to verified.

(In reply to vivian zhang from comment #3)
> Reproduced the steps several times and always got the same result;
> moving to verified.

Did you actually forget to change the bug status? :-)

(In reply to Michal Privoznik from comment #4)
> (In reply to vivian zhang from comment #3)
> > Reproduced the steps several times and always got the same result;
> > moving to verified.
>
> Did you actually forget to change the bug status? :-)

Yes, thanks for the reminder.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html