Description of problem:
When the number of bricks on a machine is high, blivet takes a very long time to build its device tree. We use python-blivet in gluster-integration to fetch device details by mount point. When many bricks are on the same node, collecting all device details and building the device tree takes a long time. blivet collects a large amount of device detail, but we use only a few basic device details from it. We have to replace the blivet dependency with some other tool to reduce sync time.

Version-Release number of selected component (if applicable):
python-blivet-0.61.15.71-1.el7.noarch

How reproducible:
Create 300 to 400 bricks on a machine with 4 GB RAM and 4 CPUs and try:

    import blivet
    b = blivet.Blivet()
    b.reset()

Steps to Reproduce:
1.
2.
3.

Actual results:
Blivet reset takes a very long time, so gluster-integration sync takes longer.

Expected results:
gluster-integration sync should execute quickly.

Additional info:
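For illustration only: a minimal sketch of how the few basic details for a brick mount point (device, filesystem type, mount options) could be read without building a full device tree, by parsing text in /proc/mounts format. This is an assumption about one lightweight approach, not the code from the actual fix. The names `parse_mounts` and `find_mount` and the sample paths are hypothetical.

```python
from collections import namedtuple

MountEntry = namedtuple("MountEntry", "device mount_point fstype options")

# Hypothetical sample in /proc/mounts format (device, mount point, fstype, options, ...).
SAMPLE = """\
/dev/mapper/rhel-root / xfs rw,relatime 0 0
/dev/mapper/vg_b1-lv_b1 /gluster/brick1 xfs rw,noatime,inode64 0 0
/dev/mapper/vg_b10-lv_b10 /gluster/brick10 xfs rw,noatime,inode64 0 0
"""


def parse_mounts(text):
    """Parse text in /proc/mounts format into a list of MountEntry.

    Escaped characters in mount points (for example \\040 for a space)
    are not decoded in this sketch.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fstype, options = fields[:4]
        entries.append(MountEntry(device, mount_point, fstype, options.split(",")))
    return entries


def find_mount(entries, path):
    """Return the entry whose mount point is the longest path prefix of `path`."""
    best = None
    for entry in entries:
        mp = entry.mount_point
        prefix = mp if mp.endswith("/") else mp + "/"
        if path == mp or path.startswith(prefix):
            if best is None or len(mp) > len(best.mount_point):
                best = entry
    return best
```

On a real node the input would come from something like `parse_mounts(open("/proc/mounts").read())`. The cost is one small file read per sync instead of a full device scan.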
blivet reset is sometimes fast and sometimes very slow
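To quantify how much the run time varies, repeated calls can be timed and summarized. The following is a small sketch under the assumption that the call to measure is passed in as a plain callable; `time_runs` and `summarize` are hypothetical helper names, and the blivet usage is shown only in a comment because it needs python-blivet on the node.

```python
import time

# perf_counter is not available on Python 2, so fall back to time.time there.
_clock = getattr(time, "perf_counter", time.time)


def time_runs(func, repeat=5):
    """Call func() `repeat` times and return the elapsed seconds of each call."""
    durations = []
    for _ in range(repeat):
        start = _clock()
        func()
        durations.append(_clock() - start)
    return durations


def summarize(durations):
    """Return (min, max, mean) of the measured durations."""
    return min(durations), max(durations), sum(durations) / len(durations)


# Hypothetical usage on a node with python-blivet installed:
#   import blivet
#   b = blivet.Blivet()
#   print(summarize(time_runs(b.reset, repeat=5)))
```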
PR is under review: https://github.com/Tendrl/gluster-integration/pull/704
Do I understand this right that blivet should not be installed at all on storage machines?
Yes, I have removed the blivet dependency from the spec files and removed all blivet import statements, so blivet should not be installed on any node.
This bug has been taken out of BU3.
The python-blivet dependency was removed from the tendrl-gluster-integration package.

Old version:
# rpm -qR tendrl-gluster-integration-1.6.3-13.el7rhgs | grep blivet
python-blivet
#

New version:
# rpm -qR tendrl-gluster-integration-1.6.3-15.el7rhgs.noarch | grep blivet
#

Package python-blivet is still a dependency of vdsm-gluster, which is a dependency of redhat-storage-server, so this package might still be installed on the Gluster Storage servers. But it is not a dependency of any Web Administration component.

Regression testing doesn't reveal any related regression.

Version-Release number of selected component:

RHGS Web Administration Server:
Red Hat Enterprise Linux Server release 7.6 (Maipo)
collectd-5.7.2-3.1.el7rhgs.x86_64
collectd-ping-5.7.2-3.1.el7rhgs.x86_64
etcd-3.2.7-1.el7.x86_64
libcollectdclient-5.7.2-3.1.el7rhgs.x86_64
python-etcd-0.4.5-2.el7rhgs.noarch
rubygem-etcd-0.3.0-2.el7rhgs.noarch
tendrl-ansible-1.6.3-11.el7rhgs.noarch
tendrl-api-1.6.3-13.el7rhgs.noarch
tendrl-api-httpd-1.6.3-13.el7rhgs.noarch
tendrl-commons-1.6.3-17.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-21.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-3.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-21.el7rhgs.noarch
tendrl-node-agent-1.6.3-18.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-3.el7rhgs.noarch
tendrl-ui-1.6.3-15.el7rhgs.noarch

Red Hat Gluster Storage Server:
Red Hat Enterprise Linux Server release 7.6 (Maipo)
collectd-5.7.2-3.1.el7rhgs.x86_64
collectd-ping-5.7.2-3.1.el7rhgs.x86_64
glusterfs-3.12.2-45.el7rhgs.x86_64
glusterfs-api-3.12.2-45.el7rhgs.x86_64
glusterfs-cli-3.12.2-45.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-45.el7rhgs.x86_64
glusterfs-events-3.12.2-45.el7rhgs.x86_64
glusterfs-fuse-3.12.2-45.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-45.el7rhgs.x86_64
glusterfs-libs-3.12.2-45.el7rhgs.x86_64
glusterfs-rdma-3.12.2-45.el7rhgs.x86_64
glusterfs-server-3.12.2-45.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libcollectdclient-5.7.2-3.1.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.4.x86_64
python2-gluster-3.12.2-45.el7rhgs.x86_64
python-etcd-0.4.5-2.el7rhgs.noarch
tendrl-collectd-selinux-1.5.4-3.el7rhgs.noarch
tendrl-commons-1.6.3-17.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-15.el7rhgs.noarch
tendrl-node-agent-1.6.3-18.el7rhgs.noarch
tendrl-selinux-1.5.4-3.el7rhgs.noarch
I've performed some basic performance measurements comparing the previous and new versions on storage nodes with 4 vCPUs, 8 GB RAM and a higher number of storage devices (24, divided into ~160 partitions), bricks (55) and Gluster Volumes (33). I imported the cluster into WA and let it run for 2 days.

On the older version:
* the average load was between 2 and 2.5,
* CPU utilization was above 30%.

On the new version:
* the average load was around 0.6,
* CPU utilization was around 13%.

Memory utilization also seems to be slightly lower on the new version, but the values are quite small, so the difference is not significant.

Version-Release number of selected component:

Previous version:
Red Hat Enterprise Linux Server release 7.6 (Maipo)
Red Hat Gluster Storage Server 3.4
tendrl-collectd-selinux-1.5.4-3.el7rhgs.noarch
tendrl-commons-1.6.3-15.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-13.el7rhgs.noarch
tendrl-node-agent-1.6.3-15.el7rhgs.noarch
tendrl-selinux-1.5.4-3.el7rhgs.noarch

New version:
Red Hat Enterprise Linux Server release 7.6 (Maipo)
Red Hat Gluster Storage Server 3.4
tendrl-collectd-selinux-1.5.4-3.el7rhgs.noarch
tendrl-commons-1.6.3-17.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-15.el7rhgs.noarch
tendrl-node-agent-1.6.3-18.el7rhgs.noarch
tendrl-selinux-1.5.4-3.el7rhgs.noarch
Verifying based on comment 8 and comment 9. >> VERIFIED
(In reply to Daniel Horák from comment #9)
> On the older version:
> * the average load was between 2 and 2.5,
> * CPU utilization was above 30%.
>
> On the new version:
> * the average load was around 0.6,
> * CPU utilization was around 13%.
>
> Also memory utilization seems to be slightly lower on the new version, but
> the values are quite small, so the difference is not significant.

The CPU and memory utilization mentioned in comment 9 relates to the tendrl-gluster-integration service.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0660