Bug 1253303 - brick crashes because of RDMA [NEEDINFO]
Status: CLOSED EOL
Product: GlusterFS
Classification: Community
Component: rdma
Version: 3.7.2
Hardware/OS: x86_64 Linux
Priority: unspecified
Severity: high
Assigned To: Mohammed Rafi KC
Keywords: Triaged
Reported: 2015-08-13 08:42 EDT by Geoffrey Letessier
Modified: 2017-03-08 06:02 EST
CC: 3 users

Doc Type: Bug Fix
Last Closed: 2017-03-08 06:02:07 EST
Type: Bug
rkavunga: needinfo? (geoffrey.letessier)


Attachments
2 of my 4 storage brick logs (13.42 KB, application/x-gzip)
2015-08-13 08:42 EDT, Geoffrey Letessier

Description Geoffrey Letessier 2015-08-13 08:42:34 EDT
Created attachment 1062515 [details]
2 of my 4 storage brick logs

Description of problem:
Sometimes a few minutes after having [re]started a volume, sometimes longer, I see some bricks in a down state.

Version-Release number of selected component (if applicable):
GlusterFS 3.7.2

How reproducible:
really often

Steps to Reproduce:
1. start the volume
2. wait a moment
3. check the volume status (see the sketch below)
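A minimal CLI sketch of these steps; the volume name vol_workdir_amd is taken from the log excerpt below, so substitute your own volume:
==
# start (or restart) the volume; "vol_workdir_amd" is taken from the brick log below
gluster volume start vol_workdir_amd
# ... wait a few minutes ...
# check whether all brick processes are still online
# (the "Online" column should show Y for every brick)
gluster volume status vol_workdir_amd
==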

Actual results:
1 (or more) brick is down

Expected results:
all bricks should be UP.

Additional info:
Here is an extract of one brick log:
==
[2015-07-21 15:31:28.870310] I [MSGID: 115034] [server.c:397:_check_for_auth_option] 0-/export/brick_workdir/brick1/data: skip format check for non-addr auth option auth.login./export/brick_workdir/brick1/data.allow
[2015-07-21 15:31:28.870342] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-07-21 15:31:28.870367] I [MSGID: 115034] [server.c:397:_check_for_auth_option] 0-/export/brick_workdir/brick1/data: skip format check for non-addr auth option auth.login.4f1596d6-a806-4b21-9efa-c6a824b756e7.password
[2015-07-21 15:31:28.882071] I [rpcsvc.c:2213:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2015-07-21 15:31:28.882166] W [options.c:936:xl_opt_validate] 0-vol_workdir_amd-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
pending frames:
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-07-21 15:33:21
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.2
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x3386824b76]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x33868435af]
/lib64/libc.so.6[0x3c432326a0]
/usr/lib64/glusterfs/3.7.2/rpc-transport/rdma.so(+0x67e0)[0x7ff76edb17e0]
/usr/lib64/glusterfs/3.7.2/rpc-transport/rdma.so(+0xbf7b)[0x7ff76edb6f7b]
/lib64/libpthread.so.0[0x3c436079d1]
/lib64/libc.so.6(clone+0x6d)[0x3c432e89dd]
==

In the attachments you can find all of my brick logs from 2 of my storage nodes.
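If it helps with triage, the two offsets inside rdma.so from the trace above can usually be resolved to function names with addr2line, assuming the glusterfs debuginfo matching 3.7.2 is installed (a sketch, not verified against this exact build):
==
# map the crash offsets in the RDMA transport to source locations
# (offsets are taken from the backtrace above; requires matching debuginfo)
addr2line -f -e /usr/lib64/glusterfs/3.7.2/rpc-transport/rdma.so 0x67e0 0xbf7b
==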
Comment 1 Mohammed Rafi KC 2015-08-19 02:01:38 EDT
Can you provide an sosreport or a backtrace of the generated core?
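For reference, one way to capture that information (a sketch; the core file path is a placeholder and depends on how cores are collected on the node):
==
# collect a diagnostic bundle from the affected storage node
sosreport
# or extract a full backtrace from the core dumped by the crashed brick process
# (replace /path/to/core with the actual core file location)
gdb -batch -ex "thread apply all bt full" /usr/sbin/glusterfsd /path/to/core
==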
Comment 2 Kaushal 2017-03-08 06:02:07 EST
This bug is being closed because GlusterFS-3.7 has reached its end of life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
