Bug 1097102 - glusterfsd crashes while doing stress tests
Summary: glusterfsd crashes while doing stress tests
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.0.0
Assignee: Krutika Dhananjay
QA Contact: Sachidananda Urs
URL:
Whiteboard:
Depends On:
Blocks: 1104915
Reported: 2014-05-13 07:26 UTC by Sachidananda Urs
Modified: 2015-05-13 17:00 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.6.0.17-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1104915
Environment:
Last Closed: 2014-09-22 19:37:23 UTC
Embargoed:


Attachments
Brick log (532.62 KB, application/x-bzip)
2014-05-13 07:26 UTC, Sachidananda Urs
Core file (8.69 MB, application/x-bzip)
2014-05-13 07:30 UTC, Sachidananda Urs


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2014:1278 0 normal SHIPPED_LIVE Red Hat Storage Server 3.0 bug fix and enhancement update 2014-09-22 23:26:55 UTC

Description Sachidananda Urs 2014-05-13 07:26:10 UTC
Created attachment 895022 [details]
Brick log

Description of problem:

glusterfsd dumped core under intense I/O on machines with 60 drives.

Backtrace:
 
(gdb) bt
#0  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44
#1  0x00007feeba9e19d6 in uuid_unparse_x (uu=<value optimized out>, out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9", 
    fmt=0x7feebaa08e00 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at ../../contrib/uuid/unparse.c:55
#2  0x00007feeba9be837 in uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>) at common-utils.c:2138
#3  0x00007feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:396
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:428
#5  0x00007feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910, client=<value optimized out>) at posix.c:2550
#6  0x00007feeba9fa2dd in gf_client_disconnect (client=0x27724a0) at client_t.c:368
#7  0x00007feeab77ed48 in server_connection_cleanup (this=0x2316390, client=0x27724a0, flags=<value optimized out>)
    at server-helpers.c:354
#8  0x00007feeab77ae2c in server_rpc_notify (rpc=<value optimized out>, xl=0x2316390, event=<value optimized out>, data=0x2bf51c0)
    at server.c:527
#9  0x00007feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980, trans=0x2bf51c0) at rpcsvc.c:720
#10 0x00007feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata=<value optimized out>, event=<value optimized out>, data=0x2bf51c0)
    at rpcsvc.c:758
#11 0x00007feeba778638 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>)
    at rpc-transport.c:512
#12 0x00007feeb115e971 in socket_event_poll_err (fd=<value optimized out>, idx=<value optimized out>, data=0x2bf51c0, 
    poll_in=<value optimized out>, poll_out=0, poll_err=0) at socket.c:1071
#13 socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x2bf51c0, poll_in=<value optimized out>, poll_out=0, 
    poll_err=0) at socket.c:2240
#14 0x00007feeba9fc6a7 in event_dispatch_epoll_handler (event_pool=0x22e2d00) at event-epoll.c:384
#15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445
#16 0x0000000000407e93 in main (argc=19, argv=0x7fffea6c7f88) at glusterfsd.c:2023

Version-Release number of selected component (if applicable):
glusterfs 3.6.0 built on May 10 2014 13:57:11

How reproducible:
Intermittent.

Steps to Reproduce:
Create a 6x2 volume and run intense I/O against it.

Attached brick log.

Comment 1 Sachidananda Urs 2014-05-13 07:30:36 UTC
Created attachment 895023 [details]
Core file

Comment 3 Nagaprasad Sathyanarayana 2014-05-14 05:29:58 UTC
Can you please provide some details about the tests performed?

Comment 4 Sachidananda Urs 2014-05-14 06:17:06 UTC
1. Compilebench, which compiles the kernel.
2. fsstress, as part of LTP.
3. Small-file creation/deletion.
4. rsync of a large amount of data, as part of archival tests.

Comment 8 Sachidananda Urs 2014-06-16 09:42:22 UTC
Verified on: glusterfs 3.6.0.17
I no longer see the brick crash.

Comment 10 errata-xmlrpc 2014-09-22 19:37:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html

