Description of problem:

Small db write operations that would normally take less than 1 second are taking a long time to complete:

sh-4.2# time heketi-cli cluster create
Cluster id: fe0d05d6b73c149c7e482749a50244e8

real    1m55.348s
user    0m0.178s
sys     0m0.084s

More complex operations are also taking a long time.

The system under test is a scale test with approx. 1927 volumes (brick mux enabled, OCS setup).

Version-Release number of selected component (if applicable):

Server versions:
[root@dhcp46-207 ~]# oc rsh glusterfs-storage-pp52j
sh-4.2# rpm -qa | grep gluster
glusterfs-api-6.0-9.el7rhgs.x86_64
glusterfs-fuse-6.0-9.el7rhgs.x86_64
python2-gluster-6.0-9.el7rhgs.x86_64
glusterfs-server-6.0-9.el7rhgs.x86_64
glusterfs-libs-6.0-9.el7rhgs.x86_64
glusterfs-6.0-9.el7rhgs.x86_64
glusterfs-client-xlators-6.0-9.el7rhgs.x86_64
glusterfs-cli-6.0-9.el7rhgs.x86_64
glusterfs-geo-replication-6.0-9.el7rhgs.x86_64
gluster-block-0.2.1-34.el7rhgs.x86_64

Client versions:
[root@dhcp46-249 ~]# rpm -qa | grep gluste
glusterfs-client-xlators-6.0-9.el7rhgs.x86_64
glusterfs-api-6.0-9.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-23.el7.x86_64
glusterfs-6.0-9.el7rhgs.x86_64
glusterfs-cli-6.0-9.el7rhgs.x86_64
glusterfs-fuse-6.0-9.el7rhgs.x86_64
glusterfs-libs-6.0-9.el7rhgs.x86_64

How reproducible:
One instance. Cluster details are located at https://bugzilla.redhat.com/show_bug.cgi?id=1732703#c6

Expected results:
Even at scale, write performance of a volume should stay fairly steady; however, things have slowed down to an unusual degree.
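A minimal sketch of how the db write latency can be sampled repeatedly, assuming heketi-cli is already pointed at the server (HEKETI_CLI_SERVER and any auth variables exported) and that throwaway clusters may be created and deleted; the iteration count is an arbitrary choice:

  #!/bin/bash
  # Time a small heketi db write (cluster create) several times, then delete
  # the throwaway cluster each time so the db is left unchanged.
  for i in 1 2 3 4 5; do
      start=$(date +%s.%N)
      out=$(heketi-cli cluster create)
      end=$(date +%s.%N)
      # Output looks like "Cluster id: <uuid>"; pull the id out for cleanup.
      id=$(echo "$out" | awk '/Cluster id:/ {print $3}')
      awk -v s="$start" -v e="$end" -v i="$i" 'BEGIN {printf "run %d: %.1f s\n", i, e - s}'
      heketi-cli cluster delete "$id"
  done

If the slowdown is in heketi's db writes, each run should show roughly the same multi-minute latency seen above rather than sub-second times.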
[root@dhcp46-249 ~]# netstat -tnap | grep 50974
tcp        0      0 10.70.46.249:1005       10.70.46.224:49153      ESTABLISHED 50974/glusterfs
tcp        0      0 10.70.46.249:1012       10.70.46.3:49153        ESTABLISHED 50974/glusterfs
tcp        0      0 10.70.46.249:1023       10.70.46.224:24007      ESTABLISHED 50974/glusterfs
tcp        0      0 10.70.46.249:1015       10.70.46.249:49153      ESTABLISHED 50974/glusterfs
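For reference, port 24007 is the glusterd management connection and ports 49152+ are brick ports, so the four sockets above appear to be one fuse client talking to three bricks plus glusterd. A quick way to locate such a client process and its sockets (the mount point and pid are placeholders):

  # Find the glusterfs fuse client serving a given mount point
  pgrep -af 'glusterfs.*<mount-point>'

  # List its TCP connections to glusterd (24007) and the bricks (49152+)
  netstat -tnap | grep <pid>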
Created attachment 1600127 [details] profile info
Created attachment 1600128 [details] State dump
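The profile info and state dump attached above can be gathered along these lines (the volume name and client pid are placeholders; the exact commands used are not recorded in this bug):

  # Enable and collect per-brick latency/fop statistics for a volume
  gluster volume profile <VOLNAME> start
  gluster volume profile <VOLNAME> info > profile_info.txt

  # Server-side state dump of the volume's bricks (written under /var/run/gluster by default)
  gluster volume statedump <VOLNAME>

  # Client-side state dump of a fuse mount process
  kill -USR1 <glusterfs-client-pid>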
Who's looking at this?
I am taking this bug out of 3.5.0 based on comment 22; if this is no longer reproducible, we should ideally be able to close it.