Description of problem:
libvirt is trying to collect some statistics through the gluster API. An API call (pub_glfs_init) is stuck, and this hangs libvirt.

Version-Release number of selected component (if applicable):
RHEL 7.4
glusterfs-3.8.4-52.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.5.x86_64

How reproducible:
N/A

Steps to Reproduce:
1. N/A
2.
3.

Actual results:
libvirt is stuck waiting on the gluster API.

Expected results:
libvirt should not be stuck.

Additional info:
According to the case, this is an RHHI deployment, where libgfapi is not supported. Can you share how the customer enabled it? Can they move back to the supported FUSE access?
Meanwhile, this could be happening because all the trusted ports are exhausted. Could you please try the following:
    # gluster volume set VOLNAME server.allow-insecure on
I see that this option is set only on the data volume; could you enable it on the other volumes as well?
Also, edit /etc/glusterfs/glusterd.vol on each Red Hat Gluster Storage node and add the following setting:
    option rpc-auth-allow-insecure on
This allows gfapi clients to communicate with glusterd even from untrusted ports. It requires a glusterd restart on all the nodes, performed one node after the other.
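To apply the volume option to every volume rather than one at a time, something like the following could be used. This is only a sketch: the function name set_allow_insecure and the RUN switch are made up for illustration, and by default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: emit (or run) the suggested "volume set" command for each volume.
# set_allow_insecure and RUN are hypothetical names, not part of gluster.
# RUN unset or 0: dry run, print the commands only.
# RUN=1: actually call the gluster CLI.
set_allow_insecure() {
    for vol in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            gluster volume set "$vol" server.allow-insecure on
        else
            echo "gluster volume set $vol server.allow-insecure on"
        fi
    done
}
```

On a storage node, the volume names could be taken from the CLI, for example: set_allow_insecure $(gluster volume list). The glusterd.vol change and the one-by-one glusterd restart still have to be done separately on each node.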
Needinfo for comment 10
We just had a similar problem: our gluster cluster had trouble because of network problems, so healing started, and libvirtd got stuck on all servers.
cat /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option ping-timeout 0
    option event-threads 1
#   option transport.address-family inet6
#   option base-port 49152
    option rpc-auth-allow-insecure on
end-volume
Thank you!
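To confirm on each node that the option is active and not commented out, a check along these lines may help. It is a rough sketch: has_glusterd_option is a hypothetical helper name, and it assumes the simple "option NAME VALUE" layout shown in the file above.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if FILE contains an active "option NAME VALUE" line.
# has_glusterd_option is a hypothetical helper, not a gluster tool.
# Lines starting with "#" are skipped because their first field is "#".
# usage: has_glusterd_option FILE NAME VALUE
has_glusterd_option() {
    awk -v opt="$2" -v val="$3" '
        $1 == "option" && $2 == opt && $3 == val { found = 1 }
        END { exit !found }
    ' "$1"
}
```

For example: has_glusterd_option /etc/glusterfs/glusterd.vol rpc-auth-allow-insecure on && echo "set". Note that the file only shows what is configured; glusterd has to be restarted after the edit for the setting to take effect.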
(In reply to Need Real Name from comment #27)
> We just had similar problem- gluster cluster had problems due to network
> problems, so healing started.
> And we got libvirtd stuck on all servers.
>
> cat /etc/glusterfs/glusterd.vol
> volume management
> type mgmt/glusterd
> option working-directory /var/lib/glusterd
> option transport-type socket,rdma
> option transport.socket.keepalive-time 10
> option transport.socket.keepalive-interval 2
> option transport.socket.read-fail-log off
> option ping-timeout 0
> option event-threads 1
> # option transport.address-family inet6
> # option base-port 49152
> option rpc-auth-allow-insecure on
> end-volume
>
>
> Thank you!
Are you using gfapi to access the disks on gluster volume?
Yes, we use it from KVM/libvirt.
By the way, what may be interesting here: we run gluster and KVM with libvirt on the same hosts, so even when the network was down libvirt should still have been able to talk to gluster, yet libvirt got stuck. We also run pacemaker, which calls the libvirt CLI quite often to check the VMs' state; I guess this can contribute to libvirt getting stuck too. Thank you!
Still have this problem...
By the way, it looks like this is more likely to be triggered on a 2-brick setup with an arbiter than on a plain 2- or 3-brick setup...
Is it reproducible on a local setup, either QE or GSS? If so, can you share the setup for debugging?
Hello! Looks like we don't have this problem while running gluster 4.1. Thank you!