Bug 1774711 - vm paused by io-error, crashed on resume, gf_thread_vcreate errors in logs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: io-threads
Version: 6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Mohit Agrawal
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-20 18:53 UTC by Darrell
Modified: 2020-02-20 04:28 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-02-20 04:28:53 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
libvirtd log for affected VM (14.80 KB, text/plain)
2019-11-20 18:53 UTC, Darrell
state dump for affected volume (221.63 KB, text/plain)
2019-11-20 18:53 UTC, Darrell
gluster vol info (1.08 KB, text/plain)
2019-11-20 18:54 UTC, Darrell

Description Darrell 2019-11-20 18:53:05 UTC
Created attachment 1638202
libvirtd log for affected VM

Description of problem: A VM locked up on I/O to a Gluster volume accessed through libgfapi mounts in oVirt, and was paused with an io-error. It crashed when I attempted to resume it. Interestingly, this seems to happen approximately every 10 days on one volume.

[2019-11-20 17:48:42.189605] E [MSGID: 101072] [common-utils.c:4030:gf_thread_vcreate] 0-gDBv2-io-threads: Thread creation failed [Resource temporarily unavailable]
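
To check whether these failures cluster before each lockup, the log can be scanned for gf_thread_vcreate errors. The following is only a rough sketch: the regular expression is inferred from the single log line quoted above and may not match other Gluster messages, and the helper names parse_line and count_thread_failures are made up for illustration.

```python
import re
from collections import Counter

# Layout inferred from the one log line quoted in this report (an assumption):
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None


def count_thread_failures(lines):
    """Count gf_thread_vcreate messages per calendar day (YYYY-MM-DD)."""
    per_day = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec and rec["func"] == "gf_thread_vcreate":
            per_day[rec["ts"].split()[0]] += 1
    return per_day


SAMPLE = (
    "[2019-11-20 17:48:42.189605] E [MSGID: 101072] "
    "[common-utils.c:4030:gf_thread_vcreate] 0-gDBv2-io-threads: "
    "Thread creation failed [Resource temporarily unavailable]"
)
```

Passing an open log file to count_thread_failures gives a per-day tally; the log path depends on the deployment and is not given in this report.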


Version-Release number of selected component (if applicable):
Gluster 6, various minor versions, most recently 6.6

How reproducible:
Has recurred approximately every 10 days since upgrading to Gluster 6

Steps to Reproduce:
1. start vm
2. wait ~10 days
3. vm crashes

Actual results:
vm locks up and crashes when attempting to resume

Expected results:
vm does not lock up using gluster volume

Additional info:
This particular volume hosts an RRD database for Observium using rrd-daemon, so there is a heavy write load every 30 minutes. The volume had performance.io-thread-count = 32 when it crashed; I have just turned it up to 64 as a test.
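
"Resource temporarily unavailable" is EAGAIN, which pthread_create reports when resources are insufficient or a thread limit has been reached. Because the failure takes about 10 days to appear, it may help to sample the thread count and resident memory of whichever process logs the error and watch for steady growth. This is a minimal, Linux-only sketch: it relies on the Threads and VmRSS fields of /proc/<pid>/status, the helper names are hypothetical, and the pid to watch is left to the reader since this report does not identify it.

```python
def parse_status(text):
    """Extract thread count and resident memory (kB) from /proc/<pid>/status text."""
    out = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "Threads":
            out["threads"] = int(value.strip())
        elif key == "VmRSS":
            # The value looks like "  123456 kB"; keep only the number.
            out["rss_kb"] = int(value.split()[0])
    return out


def sample(pid):
    """Read one sample for the given pid (Linux only; pid is supplied by the caller)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_status(fh.read())
```

Calling sample(pid) from a periodic job and recording the results would show whether either number climbs between restarts.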

Comment 1 Darrell 2019-11-20 18:53:44 UTC
Created attachment 1638203
state dump for affected volume

Comment 2 Darrell 2019-11-20 18:54:05 UTC
Created attachment 1638204
gluster vol info

Comment 3 Mohit Agrawal 2020-02-19 14:20:12 UTC
We fixed a leak issue (https://bugzilla.redhat.com/show_bug.cgi?id=1768726) in release 6.7.
I believe you will not face this issue after upgrading Gluster to the 6.7 release.
Kindly upgrade your current Gluster version to 6.7 to resolve it.

Comment 4 Darrell 2020-02-20 04:28:53 UTC
So far the VM that had been crashing regularly has been up and stable for 22 days after upgrading the servers to 6.7, so I think you probably got it. Thanks for the follow-up!

