Bug 855787 - glusterfs: client crash while testing statedump
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Assigned To: Raghavendra Bhat
QA Contact: Sudhir D
Depends On: 885008
Reported: 2012-09-10 05:43 EDT by Sachidananda Urs
Modified: 2013-09-23 18:33 EDT (History)
Fixed In Version: glusterfs-3.4.0qa8
Doc Type: Bug Fix
Last Closed: 2013-09-23 18:33:22 EDT
Type: Bug


Attachments
Client log file (93.20 KB, application/octet-stream)
2012-09-10 05:44 EDT, Sachidananda Urs
Core file (3.49 MB, application/x-bzip)
2012-09-10 05:45 EDT, Sachidananda Urs

Description Sachidananda Urs 2012-09-10 05:43:07 EDT
Description of problem:
The client crashes when I/O load is high and the graph is changed (gluster volume set/unset).


Version-Release number of selected component (if applicable):
Update 2:
glusterfs 3.3.0rhs built on Sep 10 2012 00:49:11


Steps to Reproduce:
1. Run an I/O-intensive workload on the client.
2. Change the graph multiple times (gluster volume set/unset) while the workload is running.


Additional info:


#0  0x00007f86176e9214 in client3_1_flush_cbk (req=0x7f8616400c50, iov=0x7f8616400c90, count=<value optimized out>, 
    myframe=0x7f861aeeff94) at client3_1-fops.c:865
#1  0x00007f861be810c5 in rpc_clnt_handle_reply (clnt=0x15b14b0, pollin=0x1edc4c0) at rpc-clnt.c:788
#2  0x00007f861be818c0 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x15b14e0, 
    event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:907
#3  0x00007f861be7d018 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, 
    data=<value optimized out>) at rpc-transport.c:489
#4  0x00007f8618744954 in socket_event_poll_in (this=0x15c0f10) at socket.c:1677
#5  0x00007f8618744a37 in socket_event_handler (fd=<value optimized out>, idx=5, data=0x15c0f10, poll_in=1, 
    poll_out=0, poll_err=<value optimized out>) at socket.c:1792
#6  0x00007f861c0c7d84 in event_dispatch_epoll_handler (event_pool=0x1524e00) at event.c:785
#7  event_dispatch_epoll (event_pool=0x1524e00) at event.c:847
#8  0x00000000004073ca in main (argc=<value optimized out>, argv=0x7fffeae4d888) at glusterfsd.c:1689
Comment 1 Sachidananda Urs 2012-09-10 05:44:58 EDT
Created attachment 611372
Client log file
Comment 2 Sachidananda Urs 2012-09-10 05:45:45 EDT
Created attachment 611373
Core file
Comment 4 Amar Tumballi 2012-09-17 01:24:20 EDT
The patch below should fix it:

amar@ganaka:~/work/glusterfs$ git diff
diff --git a/xlators/performance/write-behind/src/write-behind.c b/xlators/performance/write-behind/src/write-behind.c
index ad1e5f0..59cbd00 100644
--- a/xlators/performance/write-behind/src/write-behind.c
+++ b/xlators/performance/write-behind/src/write-behind.c
@@ -2604,7 +2604,8 @@ wb_flush_helper (call_frame_t *frame, xlator_t *this, fd_t *fd, dict_t *xdata)
                 wb_request_unref (local->request);
         }
 
-        if (conf->flush_behind) {
+        int flag = conf->flush_behind;
+        if (flag) {
                 flush_frame = copy_frame (frame);
                 if (flush_frame == NULL) {
                         op_errno = ENOMEM;
@@ -2628,7 +2629,7 @@ wb_flush_helper (call_frame_t *frame, xlator_t *this, fd_t *fd, dict_t *xdata)
                 STACK_DESTROY (process_frame->root);
         }
 
-        if (conf->flush_behind) {
+        if (flag) {
                 STACK_UNWIND_STRICT (flush, frame, op_ret, op_errno, NULL);
         }
 
--------------
Also, if the proposed patch at http://review.gluster.org/3947 goes in, this race will no longer exist.
Comment 5 Amar Tumballi 2012-10-06 11:01:59 EDT
Patch accepted upstream.
Comment 6 Sachidananda Urs 2012-12-18 02:07:17 EST
The crash still happens and is related to bug 885008: https://bugzilla.redhat.com/show_bug.cgi?id=885008

Will continue testing once bug 885008 is fixed.
Comment 7 Amar Tumballi 2012-12-20 02:50:03 EST
Marking as MODIFIED since there is no remaining work for this particular bug. It will stay in MODIFIED until the bug blocking verification is fixed.
Comment 8 Sachidananda Urs 2013-03-03 08:45:51 EST
Verified with load on multiple clients and a series of graph changes in a loop on the servers. No crashes seen.
Comment 9 Scott Haines 2013-09-23 18:33:22 EDT
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
