Bug 1600057 - crash on glusterfs_handle_brick_status of the glusterfsd
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: core
Version: 3.3
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.4.0
Assigned To: hari gowtham
QA Contact: Rajesh Madaka
Depends On:
Blocks: 1503137
Reported: 2018-07-11 06:29 EDT by hari gowtham
Modified: 2018-09-18 02:38 EDT

See Also:
Fixed In Version: glusterfs-3.12.2-14
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1600451
Environment:
Last Closed: 2018-09-04 02:50:20 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2018:2607 | None | None | None | 2018-09-04 02:51 EDT

Description hari gowtham 2018-07-11 06:29:42 EDT
Description of problem:
On a WA setup, the glusterfsd processes crash at random points, possibly because of a race.
This bug only covers avoiding the crash. The RCA is tracked separately.

They crash at:

Reading symbols from /usr/sbin/glusterfsd...Reading symbols from /usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.
[New LWP 3816]
[New LWP 3817]
[New LWP 3823]
[New LWP 3813]
[New LWP 3814]
[New LWP 3815]
[New LWP 3812]

warning: Could not load shared library symbols for /lib64/libnss_sss.so.2.
Do you need "set solib-search-path" or "set sysroot"?

warning: Could not load shared library symbols for SINFO:      0x.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfsd -s bxts470192.eu.rabonet.com --volfile-id prod_xvavol.bxts'.
Program terminated with signal 11, Segmentation fault.
#0  glusterfs_handle_brick_status (req=0x7fae38001910) at glusterfsd-mgmt.c:1029
1029            any = active->first;
(gdb) bt
#0  glusterfs_handle_brick_status (req=0x7fae38001910) at glusterfsd-mgmt.c:1029
#1  0x00007fae4a5444f2 in synctask_wrap (old_task=<optimized out>) at syncop.c:375
#2  0x00007fae48b87d40 in ?? () from /lib64/libc.so.6
#3  0x0000000000000000 in ?? ()
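
Frame #0 faults on `any = active->first;`, which is what a NULL (or otherwise invalid) `active` pointer would produce. Since this bug is only about avoiding the crash, a guard of the following shape is one way to do that. This is a minimal sketch under that assumption, not the patch that shipped: `struct xlator`, `struct graph`, `struct ctx` and both function names are hypothetical stand-ins, not the real GlusterFS definitions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; NOT the real GlusterFS types. */
struct xlator { const char *name; };
struct graph  { struct xlator *first; };
struct ctx    { struct graph *active; };

/*
 * Return the first translator of the active graph, or NULL when the
 * context has no active graph (or the graph has no translator).
 * The caller must treat NULL as "cannot serve this request".
 */
static struct xlator *first_xlator_or_null(const struct ctx *ctx)
{
    if (ctx == NULL || ctx->active == NULL)
        return NULL;
    return ctx->active->first;
}

/* Sketch of a handler: fail the request with -1 instead of crashing. */
static int handle_brick_status_sketch(const struct ctx *ctx)
{
    struct xlator *any = first_xlator_or_null(ctx);

    if (any == NULL) {
        fprintf(stderr, "no usable graph; rejecting brick status request\n");
        return -1;
    }
    printf("serving brick status from xlator %s\n", any->name);
    return 0;
}
```

With this shape, a request that arrives while no graph is active gets an error back instead of taking the process down. It does not address why `active` is unset at that point, which is the separate RCA.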


Version-Release number of selected component (if applicable):
3.3

How reproducible:
rarely on WA cluster

Steps to Reproduce:
1. Create a gluster volume.
2. Import it into WA.
3. Use the volume. The glusterfsd processes crash after some time.

Actual results:
Crashes once in a while.

Expected results:
glusterfsd should not crash.

Additional info:
The exact way to reproduce this is unknown. From the details we have so far, it looks like a race between the get-state detail and profile commands, issued in various orders.
These are the commands the WA setup issues on gluster at the time of the crash.

Most of the time these commands work well in any order. Once in a while they crash.

The RCA for this is tracked in the bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1596371

A similar bug also tracks the RCA:
https://bugzilla.redhat.com/show_bug.cgi?id=1576726

Both end up in a situation that is not supposed to happen, and both were seen in the same environment.
Comment 9 Rajesh Madaka 2018-08-24 07:24:22 EDT
I followed the steps mentioned above in this bug.

After importing gluster into WA, I observed the setup for more than 12 hours and did not find any glusterfsd crashes.

Gluster build version:

glusterfs-server-3.12.2-16
Comment 10 errata-xmlrpc 2018-09-04 02:50:20 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
