Description of problem
======================

For a workload which consists of creating a lot of small files (see details below), memory consumption on the gluster storage machines grows to the point where swap is 100% utilized and some brick daemons crash.

This has been observed during testing of RHGS WA, so the reproducer (and evidence) includes installation of RHGS WA. While this has been observed on storage machines with only 2 GB of RAM (which is way below expected production values), the fact that gluster filled up the swap to 100% and then some bricks crashed is recent behavior, not seen in gluster builds from a few weeks ago (exact versions are listed below).

Version-Release number of selected component
============================================

glusterfs-3.12.2-16.el7rhgs.x86_64
glusterfs-api-3.12.2-16.el7rhgs.x86_64
glusterfs-cli-3.12.2-16.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-16.el7rhgs.x86_64
glusterfs-events-3.12.2-16.el7rhgs.x86_64
glusterfs-fuse-3.12.2-16.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-16.el7rhgs.x86_64
glusterfs-libs-3.12.2-16.el7rhgs.x86_64
glusterfs-rdma-3.12.2-16.el7rhgs.x86_64
glusterfs-server-3.12.2-16.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.6.x86_64
python2-gluster-3.12.2-16.el7rhgs.x86_64
tendrl-gluster-integration-1.6.3-9.el7rhgs.noarch
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch

How reproducible
================

1/1

Steps to Reproduce
==================

1. Install a Gluster trusted storage pool via gdeploy on 6 virtual machines with 2 GB of RAM, using the gdeploy config files [1].
2. Install a dedicated client machine, and mount all gluster volumes there.
3. Install RHGS WA on the dedicated machine and on the just-created trusted storage pool.
4. Import the trusted pool into WA, with gluster volume profiling enabled.
5. On the client machine, prepare the wikipedia tarball [2].
6. On the client, cd to the mountpoint of the beta volume and run:

   # bzcat /tmp/enwiki-latest-pages-articles.xml.bz2 | wiki-export-split.py --noredir --filenames=sha1 --sha1sum=wikipages.sha1 --max-files=1000000

   This will start extracting wikipedia articles into individual files (a rough sketch of the resulting small-file I/O pattern is shown after this list).
7. Wait a few hours.

[1] https://github.com/usmqe/usmqe-setup/blob/master/gdeploy_config/volume_beta_arbiter_2_plus_1x2.create.conf
    https://github.com/usmqe/usmqe-setup/blob/master/gdeploy_config/volume_gama_disperse_4_plus_2x2.create.conf
[2] https://github.com/usmqe/usmqe-setup/blob/master/test_setup.wiki_tarball.yml
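For reference, here is a minimal sketch of the kind of small-file workload that step 6 generates on the mounted volume. This is not wiki-export-split.py itself (its internals are not part of this report); it only assumes that the script writes each article as a separate small file named by a SHA-1 digest, matching the --filenames=sha1 option:

```
#!/usr/bin/env python2
# Sketch of the small-file workload pattern from step 6: create many small
# files named by a SHA-1 digest in the current directory (the volume
# mountpoint). Payload size and contents are arbitrary illustrative choices.
import hashlib
import os

MOUNTPOINT = "."          # run from the mountpoint of the beta volume
MAX_FILES = 1000000       # matches --max-files=1000000

def write_small_files(count, payload_size=4096):
    for i in xrange(count):
        # build a few KiB of text standing in for one wikipedia article
        data = ("article %d\n" % i) * (payload_size // 16)
        name = hashlib.sha1(data).hexdigest()
        with open(os.path.join(MOUNTPOINT, name), "w") as f:
            f.write(data)

if __name__ == "__main__":
    write_small_files(MAX_FILES)
```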
Actual results
==============

Storage machines run out of free memory and swap, some gluster bricks there crash, and the volume switches into read-only mode, stopping the workload before completion:

```
[root@mbukatov-usm1-client volume_beta_arbiter_2_plus_1x2]# bzcat /tmp/enwiki-latest-pages-articles.xml.bz2 | wiki-export-split.py --noredir --filenames=sha1 --sha1sum=wikipages.sha1 --max-files=1000000
Traceback (most recent call last):
  File "/usr/local/bin/wiki-export-split.py", line 222, in <module>
    sys.exit(main())
  File "/usr/local/bin/wiki-export-split.py", line 216, in main
    return process_xml(sys.stdin, opts)
  File "/usr/local/bin/wiki-export-split.py", line 200, in process_xml
    parser.parse(xml_file)
  File "/usr/lib64/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib64/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib64/python2.7/xml/sax/expatreader.py", line 210, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib64/python2.7/xml/sax/expatreader.py", line 307, in end_element
    self._cont_handler.endElement(name)
  File "/usr/local/bin/wiki-export-split.py", line 150, in endElement
    self.page.end()
  File "/usr/local/bin/wiki-export-split.py", line 97, in end
    self._file.close()
  File "/usr/lib64/python2.7/tempfile.py", line 412, in close
    self.file.close()
IOError: [Errno 107] Transport endpoint is not connected
close failed in file object destructor:
IOError: [Errno 30] Read-only file system
[root@mbukatov-usm1-client volume_beta_arbiter_2_plus_1x2]
```

See attached usm1 screenshots.

Expected results
================

Gluster bricks don't crash, and the workload finishes after some time.

Additional info
===============

I retried this on an older setup (which I used a few weeks ago without noticing the problem), setting the RAM to 2 GB there and enabling profiling, to match the environment above. I could see some growing memory utilization, but nothing crashed and the workload finished successfully. (A sketch for logging per-brick memory usage directly on a storage node follows the version list below.)

The working versions:

glusterfs-3.12.2-14.el7rhgs.x86_64
glusterfs-api-3.12.2-14.el7rhgs.x86_64
glusterfs-cli-3.12.2-14.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-14.el7rhgs.x86_64
glusterfs-events-3.12.2-14.el7rhgs.x86_64
glusterfs-fuse-3.12.2-14.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-14.el7rhgs.x86_64
glusterfs-libs-3.12.2-14.el7rhgs.x86_64
glusterfs-rdma-3.12.2-14.el7rhgs.x86_64
glusterfs-server-3.12.2-14.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.6.x86_64
python2-gluster-3.12.2-14.el7rhgs.x86_64
tendrl-gluster-integration-1.6.3-7.el7rhgs.noarch
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch
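The memory growth above was observed via the WA host dashboards. As a complement, per-brick memory growth can also be captured directly on a storage node by periodically sampling the RSS of the glusterfsd brick processes from /proc. The sketch below was not part of the original test run; the sampling interval and output format are illustrative choices:

```
#!/usr/bin/env python2
# Periodically log the resident set size (VmRSS) of all glusterfsd (brick)
# processes on this storage node, so memory growth can be correlated with
# the small-file workload on the client.
import os
import time

def glusterfsd_rss_kib():
    """Return {pid: VmRSS in KiB} for every running glusterfsd process."""
    result = {}
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open("/proc/%s/comm" % pid) as f:
                if f.read().strip() != "glusterfsd":
                    continue
            with open("/proc/%s/status" % pid) as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        result[int(pid)] = int(line.split()[1])
                        break
        except (IOError, OSError):
            # the process exited between listdir() and open()
            continue
    return result

if __name__ == "__main__":
    while True:
        print("%s %s" % (time.strftime("%Y-%m-%d %H:%M:%S"),
                         glusterfsd_rss_kib()))
        time.sleep(60)
```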
Created attachment 1477364 [details]
screenshot usm1 (affected cluster): host dashboard with crash and growing memory utilization visible

See 2 events in the Host dashboard of the affected storage machine: first the memory runs out (Swap Utilization chart peaks at 100%), a brick crashes (see the Brick Down counter) and copying stops (the Total Brick Utilization chart stops growing, the Brick IOPS chart goes down to zero). Then, with the workload stopped (no IOPS reported), the memory keeps growing until it reaches 100% of swap utilization again and a second brick daemon on the machine crashes.
Created attachment 1477367 [details]
screenshot of host dashboard from cluster using older builds, which doesn't crash during the same workload

Screenshot from another cluster (usm1), which uses older gluster and WA builds, where gluster doesn't crash during the same workload.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607