Bug 786139

Summary: Slow virsh interaction in ricci
Product: Red Hat Enterprise Linux 6
Reporter: Vlastimil Holer <vlastimil.holer>
Component: ricci
Assignee: Chris Feist <cfeist>
Status: CLOSED CURRENTRELEASE
QA Contact: Cluster QE <mspqa-list>
Severity: unspecified
Priority: unspecified
Version: 6.2
CC: bugproxy, cluster-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2012-02-03 22:50:05 UTC
Attachments:
Description: Contains screenshots with Error and luci logs while attempting to create a new cluster.
Flags: none

Description Vlastimil Holer 2012-01-31 15:16:57 UTC
Description of problem:
I have a two-node RHCS cluster whose configuration is managed and distributed by ccs/ricci. By default the ricci daemon runs as the 'ricci' user. With this setting, even a simple command takes tens of seconds to complete. For example:

$ time ccs -h localhost --lsnodes
ceriha1.hb: nodeid=1
ceriha2.hb: nodeid=2

real	0m21.252s
user	0m0.097s
sys	0m0.028s

During the execution, ricci waits most of the time for virsh to finish:

ricci -u ricci
  ├─virsh nodeinfo
  └─{ricci}

but virsh times out while trying to reach libvirtd through its socket:
...
connect(6, {sa_family=AF_FILE, path=@"/var/lib/ricci/.libvirt/libvirt-sock"}, 110) = -1 ECONNREFUSED (Connection refused)
nanosleep({1, 900000000}, NULL)         = 0
connect(6, {sa_family=AF_FILE, path=@"/var/lib/ricci/.libvirt/libvirt-sock"}, 110) = -1 ECONNREFUSED (Connection refused)
nanosleep({2, 0}, NULL)                 = 0
connect(6, {sa_family=AF_FILE, path=@"/var/lib/ricci/.libvirt/libvirt-sock"}, 110) = -1 ECONNREFUSED (Connection refused)
gettid()                                = 12788
close(6)                                = 0
gettid()                                = 12788
gettid()                                = 12788
gettid()                                = 12788
write(2, "error: ", 7)                  = 7
write(2, "Failed to reconnect to the hyper"..., 38) = 38
write(2, "error: ", 7)                  = 7
write(2, "no valid connection\n", 20)   = 20
write(2, "error: ", 7)                  = 7
write(2, "Failed to connect socket to '@/v"..., 88) = 88
fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa9e7a55000
write(1, "\n", 1)                       = 1
exit_group(1)                           = ?
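
The trace above shows virsh retrying connect() with a nanosleep() between attempts. One rough way to see how much of the wall-clock time those sleeps account for is to total the nanosleep arguments from a saved strace log. This is a minimal sketch, not part of the original report: the function name total_retry_sleep and the log file name virsh.trace are hypothetical, and it assumes the nanosleep({SEC, NSEC}, ...) line format shown above with no PID prefix (that is, strace run without -f).

```shell
#!/bin/sh
# Sum the requested sleep time, in seconds, of every nanosleep() call
# found in strace output read from stdin.
# Assumes lines of the form: nanosleep({SEC, NSEC}, NULL) = 0
total_retry_sleep() {
    awk '
        /^nanosleep[(][{]/ {
            line = $0
            sub(/^nanosleep[(][{]/, "", line)  # drop the call prefix
            sub(/[}].*/, "", line)             # drop everything after the timespec
            split(line, ts, /, */)             # ts[1] = seconds, ts[2] = nanoseconds
            total += ts[1] + ts[2] / 1000000000
        }
        END { printf "%.1f\n", total }
    '
}

# Example (hypothetical log file captured with: strace -o virsh.trace virsh nodeinfo):
#   total_retry_sleep < virsh.trace
```

If the total is close to the "real" time reported by time, the delay is essentially the retry back-off rather than any work done by ricci or ccs.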

We use the default libvirt configuration, and connecting from an ordinary user's shell works fine:
$ virsh nodeinfo >/dev/null; echo $?
0

If I run the ricci daemon as root, i.e. set RUNASUSER="root" in
/etc/init.d/ricci, the command completes quickly:

$ time ccs -h localhost --lsnodes
ceriha1.hb: nodeid=1
ceriha2.hb: nodeid=2

real	0m0.300s
user	0m0.093s
sys	0m0.039s

Version-Release number of selected component (if applicable):
CentOS release 6.2 (Final)
libvirt-0.9.4-23.el6_2.4.x86_64
ricci-0.16.2-43.el6.x86_64
ccs-0.16.2-43.el6.x86_64

- libvirt in default settings
- no SELinux

Comment 2 Chris Feist 2012-01-31 15:32:43 UTC
This issue is caused by changes to libvirt; we're working on determining the best place to make a fix.

For now, this issue can be worked around on the cluster nodes with the
following commands:
mkdir /var/lib/ricci/.libvirt
chown ricci.ricci /var/lib/ricci/.libvirt

Thanks!
Chris
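
The two workaround commands above can be wrapped so that they are safe to run repeatedly, for example from a provisioning script. This is only a sketch under stated assumptions: the function name ensure_libvirt_dir is hypothetical, the directory and owner come from the workaround in this comment, and it uses the user:group form that current chown documents in place of the older user.group spelling.

```shell
#!/bin/sh
# Create the per-user libvirt directory if it is missing and, when an
# owner is given, hand the directory to that owner. Safe to run twice.
# Arguments: $1 = directory, $2 = optional owner in user:group form.
ensure_libvirt_dir() {
    dir=$1
    owner=${2:-}
    mkdir -p "$dir" || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
}

# On a cluster node, as root (values taken from the workaround above):
#   ensure_libvirt_dir /var/lib/ricci/.libvirt ricci:ricci
```

Using mkdir -p means an already existing directory is not treated as an error, which plain mkdir would report.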

Comment 3 Vlastimil Holer 2012-01-31 15:54:18 UTC
Quickfix helped, thanks!

Comment 4 Lon Hohberger 2012-01-31 19:14:23 UTC
This is intended to be addressed in libvirt.

See bug 785038

Comment 5 Chris Feist 2012-02-03 22:50:05 UTC
This fix in libvirt will not require any changes to ricci.

*** This bug has been marked as a duplicate of bug 785038 ***

Comment 6 Ryan McCabe 2012-03-23 14:16:11 UTC
*** Bug 783027 has been marked as a duplicate of this bug. ***

Comment 7 IBM Bug Proxy 2012-03-23 14:24:21 UTC
Created attachment 572292
Contains screenshots with Error and luci logs while attempting to create a new cluster.

Comment 8 Chris Feist 2012-04-30 23:02:07 UTC
This has been fixed by a 6.2 zstream fix in libvirt-0.9.4-23.el6_2.7.

Please let me know if you see this issue after upgrading to the latest version of libvirt.
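
To check whether a node already has the fixed libvirt build named above, the installed version-release string can be compared against 0.9.4-23.el6_2.7. The following is a hedged sketch: the function name version_at_least is hypothetical, and GNU sort -V is used as an approximation of RPM version ordering, which is adequate for strings that differ only in the trailing z-stream number but is not a full RPM comparison.

```shell
#!/bin/sh
# Succeed if version $1 sorts at or after version $2.
# GNU sort -V stands in for RPM's own comparison here (an approximation).
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example on a cluster node:
#   installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' libvirt)
#   if version_at_least "$installed" 0.9.4-23.el6_2.7; then
#       echo "libvirt fix present"
#   else
#       echo "libvirt older than the fixed build"
#   fi
```

With the version from the original report, 0.9.4-23.el6_2.4, the check fails, which matches the report that the problem was seen on that build.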