Bug 764170 - (GLUSTER-2438) crash during using NFS mount
Status: CLOSED DUPLICATE of bug 764213
Product: GlusterFS
Classification: Community
Component: nfs
Version: 3.1.2
Hardware: All Linux
Priority: urgent
Severity: high
Assigned To: Shehjar Tikoo
Reported: 2011-02-16 19:02 EST by Piotr Kandziora
Modified: 2015-12-01 11:45 EST

Doc Type: Bug Fix
Regression: RTNR
Mount Type: nfs
Documentation: DNR


Description Piotr Kandziora 2011-02-16 19:02:35 EST
After a few weeks of error-free operation, GlusterFS crashed.

System: Ubuntu Lucid (2.6.32-28-generic)
Mount type: NFS (default mount parameters)
Configuration type: distributed-replicate (4x2)

Content of nfs.log file:

[2011-02-16 14:37:10.491382] I [afr-common.c:819:afr_fresh_lookup_cbk] test_volume-replicate-1: added root inode
[2011-02-16 14:37:10.491852] I [afr-common.c:819:afr_fresh_lookup_cbk] test_volume-replicate-2: added root inode
[2011-02-16 14:37:10.492352] I [afr-common.c:819:afr_fresh_lookup_cbk] test_volume-replicate-3: added root inode
[2011-02-16 16:14:09.924988] I [afr-common.c:613:afr_lookup_self_heal_check] test_volume-replicate-2: size differs for /onerepo/50cab8255b4c3a6ee377baf45fc61e7c7cab6b32
[2011-02-16 16:14:09.925093] I [afr-common.c:716:afr_lookup_done] test_volume-replicate-2: background  meta-data data self-heal triggered. path: /onerepo/50cab8255b4c3a6ee377baf45fc61e7c7cab6b32
[2011-02-16 16:14:10.35264] I [afr-common.c:613:afr_lookup_self_heal_check] test_volume-replicate-2: size differs for /onerepo/75103af384304fdbffadfa377690f0344fb532a3
[2011-02-16 16:14:10.35350] I [afr-common.c:716:afr_lookup_done] test_volume-replicate-2: background  meta-data data self-heal triggered. path: /onerepo/75103af384304fdbffadfa377690f0344fb532a3
[2011-02-16 16:14:10.413155] I [afr-common.c:613:afr_lookup_self_heal_check] test_volume-replicate-2: size differs for /onerepo/e70ee7f945ad888426f55e6f0cdf39c091241603
[2011-02-16 16:14:10.413271] I [afr-common.c:716:afr_lookup_done] test_volume-replicate-2: background  meta-data data self-heal triggered. path: /onerepo/e70ee7f945ad888426f55e6f0cdf39c091241603
[2011-02-16 16:14:10.761177] I [afr-common.c:613:afr_lookup_self_heal_check] test_volume-replicate-2: size differs for /onerepo/0eaa253ac3e9dd9b5656828a4f7e7486d6759850
[2011-02-16 16:14:10.761288] I [afr-common.c:716:afr_lookup_done] test_volume-replicate-2: background  meta-data data self-heal triggered. path: /onerepo/0eaa253ac3e9dd9b5656828a4f7e7486d6759850
[2011-02-16 16:14:12.354160] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] test_volume-replicate-2: background  meta-data data self-heal completed on /onerepo/75103af384304fdbffadfa377690f0344fb532a3
[2011-02-16 16:14:12.797746] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] test_volume-replicate-2: background  meta-data data self-heal completed on /onerepo/e70ee7f945ad888426f55e6f0cdf39c091241603
[2011-02-16 16:23:31.949740] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=0)
[2011-02-16 16:24:56.959805] E [rpcsvc.c:1693:nfs_rpcsvc_submit_generic] nfsrpc: Failed to submit message
[2011-02-16 16:24:56.959940] E [nfs3.c:522:nfs3svc_submit_reply] nfs-nfsv3: Reply submission failed
[2011-02-16 16:24:56.961438] E [rpcsvc.c:1693:nfs_rpcsvc_submit_generic] nfsrpc: Failed to submit message
[2011-02-16 16:24:56.961553] E [nfs3.c:522:nfs3svc_submit_reply] nfs-nfsv3: Reply submission failed
[2011-02-16 16:24:56.974049] E [rpcsvc.c:1693:nfs_rpcsvc_submit_generic] nfsrpc: Failed to submit message
[2011-02-16 16:24:56.974116] E [nfs3.c:522:nfs3svc_submit_reply] nfs-nfsv3: Reply submission failed
[2011-02-16 16:24:57.46931] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=0)
[2011-02-16 16:24:57.47068] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=-1)
[2011-02-16 16:24:57.47133] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=-1)
[2011-02-16 16:24:57.56965] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=-1)
[2011-02-16 16:24:57.57052] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=-1)
[2011-02-16 16:24:57.60621] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=-1)
[2011-02-16 16:24:57.60708] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=-1)
[2011-02-16 16:24:57.66769] E [event.c:722:event_select_on_epoll] epoll: index not found for fd=-1 (idx_hint=-1)
pending frames:

patchset: v3.1.1-64-gf2a067c
signal received: 11
time of crash: 2011-02-16 16:24:57
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.2
/lib/libc.so.6(+0x33af0)[0x7f002031daf0]
/usr/lib/glusterfs/3.1.2/xlator/nfs/server.so(nfs_rpcsvc_submit_vectors+0x160)[0x7f001d89ef70]
/usr/lib/glusterfs/3.1.2/xlator/nfs/server.so(nfs3svc_submit_vector_reply+0xa4)[0x7f001d885b14]
/usr/lib/glusterfs/3.1.2/xlator/nfs/server.so(nfs3_read_reply+0xf5)[0x7f001d886915]
/usr/lib/glusterfs/3.1.2/xlator/nfs/server.so(nfs3svc_read_cbk+0x8a)[0x7f001d8898aa]
/usr/lib/glusterfs/3.1.2/xlator/nfs/server.so(nfs_fop_readv_cbk+0x53)[0x7f001d87baa3]
/usr/lib/glusterfs/3.1.2/xlator/debug/io-stats.so(io_stats_readv_cbk+0x16f)[0x7f001dac4aef]
/usr/lib/glusterfs/3.1.2/xlator/performance/quick-read.so(qr_readv_cbk+0xa6)[0x7f001dccd0a6]
/usr/lib/glusterfs/3.1.2/xlator/performance/io-cache.so(ioc_frame_return+0x366)[0x7f001dedf356]
/usr/lib/glusterfs/3.1.2/xlator/performance/io-cache.so(ioc_waitq_return+0x1c)[0x7f001dedf5ac]
/usr/lib/glusterfs/3.1.2/xlator/performance/io-cache.so(ioc_fault_cbk+0x259)[0x7f001dee0a59]
/usr/lib/glusterfs/3.1.2/xlator/performance/read-ahead.so(ra_readv_disabled_cbk+0x9b)[0x7f001e0e871b]
/usr/lib/glusterfs/3.1.2/xlator/performance/write-behind.so(wb_readv_cbk+0xab)[0x7f001e2f633b]
/usr/lib/glusterfs/3.1.2/xlator/cluster/distribute.so(dht_readv_cbk+0xd3)[0x7f001e50d793]
/usr/lib/glusterfs/3.1.2/xlator/cluster/replicate.so(afr_readv_cbk+0x372)[0x7f001e73bcf2]
/usr/lib/glusterfs/3.1.2/xlator/protocol/client.so(client3_1_readv_cbk+0x367)[0x7f001e995567]
/usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f0020cb1c15]
/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0xc9)[0x7f0020cb1e69]
/usr/lib/libgfrpc.so.0(rpc_transport_notify+0x2d)[0x7f0020cad02d]
/usr/lib/glusterfs/3.1.2/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f0017b3f334]
/usr/lib/glusterfs/3.1.2/rpc-transport/socket.so(socket_event_handler+0xb3)[0x7f0017b3f403]
/usr/lib/libglusterfs.so.0(+0x38592)[0x7f0020ef1592]
/usr/sbin/glusterfs(main+0x247)[0x405597]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7f0020308c4d]
/usr/sbin/glusterfs[0x4032a9]
---------
[2011-02-16 22:31:55.556079] I [nfs.c:685:init] nfs: NFS service started
[2011-02-16 22:31:55.556168] W [dict.c:1205:data_to_str] dict: @data=(nil)
[2011-02-16 22:31:55.556182] W [dict.c:1205:data_to_str] dict: @data=(nil)
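
For triage, the backtrace in the log above can be reduced to a list of (module, function) pairs, which makes the translator call path leading to the crash easier to read. The following is only a minimal sketch: it assumes the frame format `path(symbol+0xoffset)[0xaddress]` seen in this log, and the names `FRAME_RE` and `parse_frames` are hypothetical helpers, not part of GlusterFS.

```python
import re

# Assumed frame format, taken from the log above: path(symbol+0xoffset)[0xaddress]
# The symbol may be empty, as in the libc frame "(+0x33af0)".
FRAME_RE = re.compile(
    r"^(?P<path>\S+?)\((?P<symbol>[^)+]*)\+0x[0-9a-fA-F]+\)\[0x[0-9a-fA-F]+\]$"
)


def parse_frames(lines):
    """Return (module basename, symbol) for every line that looks like a frame.

    Lines that do not match the assumed format (log messages, frames
    without a parenthesised symbol) are skipped.
    """
    frames = []
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match:
            module = match.group("path").rsplit("/", 1)[-1]
            frames.append((module, match.group("symbol") or "?"))
    return frames


# A few frames copied from the backtrace in this report.
sample = [
    "/lib/libc.so.6(+0x33af0)[0x7f002031daf0]",
    "/usr/lib/glusterfs/3.1.2/xlator/nfs/server.so(nfs_rpcsvc_submit_vectors+0x160)[0x7f001d89ef70]",
    "/usr/lib/glusterfs/3.1.2/xlator/nfs/server.so(nfs3svc_submit_vector_reply+0xa4)[0x7f001d885b14]",
    "/usr/sbin/glusterfs[0x4032a9]",
]

for module, symbol in parse_frames(sample):
    print(f"{module}: {symbol}")
```

Run against the full backtrace, this kind of summary shows the read reply travelling up from protocol/client through replicate, distribute and the performance translators before failing in `nfs_rpcsvc_submit_vectors` in the NFS server translator.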


In the kernel log (dmesg) I noticed a call trace (it probably occurred while I was trying to copy some files from the volume). Details below:

[1241404.151335] Call Trace:
[1241404.151359]  [<ffffffff810f4290>] ? sync_page+0x0/0x50
[1241404.151377]  [<ffffffff81542b73>] io_schedule+0x73/0xc0
[1241404.151390]  [<ffffffff810f42cd>] sync_page+0x3d/0x50
[1241404.151404]  [<ffffffff815432aa>] __wait_on_bit_lock+0x5a/0xc0
[1241404.151417]  [<ffffffff810f4267>] __lock_page+0x67/0x70
[1241404.151432]  [<ffffffff81084760>] ? wake_bit_function+0x0/0x40
[1241404.151445]  [<ffffffff810fe682>] ? pagevec_lookup+0x22/0x30
[1241404.151458]  [<ffffffff811000c6>] invalidate_inode_pages2_range+0x296/0x2b0
[1241404.151473]  [<ffffffff811000f7>] invalidate_inode_pages2+0x17/0x20
[1241404.151511]  [<ffffffffa022dcfb>] nfs_invalidate_mapping_nolock+0x2b/0xf0 [nfs]
[1241404.151544]  [<ffffffffa022ef27>] nfs_revalidate_mapping+0xc7/0xd0 [nfs]
[1241404.151560]  [<ffffffff81154dd9>] ? set_fd_set+0x49/0x60
[1241404.151588]  [<ffffffffa022bb87>] nfs_file_read+0x77/0x130 [nfs]
[1241404.151604]  [<ffffffff811437fa>] do_sync_read+0xfa/0x140
[1241404.151617]  [<ffffffff81084720>] ? autoremove_wake_function+0x0/0x40
[1241404.151633]  [<ffffffff8133382b>] ? put_ldisc+0x5b/0xc0
[1241404.151646]  [<ffffffff8132ded3>] ? tty_write+0x233/0x2a0
[1241404.151662]  [<ffffffff81252db6>] ? security_file_permission+0x16/0x20
[1241404.151676]  [<ffffffff81144115>] vfs_read+0xb5/0x1a0
[1241404.151688]  [<ffffffff811442d1>] sys_read+0x51/0x80
[1241404.151704]  [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
[1241524.150152] INFO: task mc:1974 blocked for more than 120 seconds.
[1241524.168217] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1241524.204990] mc            D 0000000000000000     0  1974   1729 0x00000004
[1241524.205002]  ffff880414f39b18 0000000000000082 0000000000015bc0 0000000000015bc0
[1241524.205012]  ffff880414ebdf78 ffff880414f39fd8 0000000000015bc0 ffff880414ebdbc0
[1241524.205023]  0000000000015bc0 ffff880414f39fd8 0000000000015bc0 ffff880414ebdf78

GDB output (gdb /usr/sbin/glusterfs /core):

Reading symbols from /usr/sbin/glusterfs...(no debugging symbols found)...done.
[New Thread 16475]
[New Thread 16477]
[New Thread 16476]
[New Thread 16478]

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/libglusterfs.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libglusterfs.so.0
Reading symbols from /usr/lib/libgfrpc.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgfrpc.so.0
Reading symbols from /usr/lib/libgfxdr.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgfxdr.so.0
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/protocol/client.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/protocol/client.so
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/cluster/replicate.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/cluster/replicate.so
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/cluster/distribute.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/cluster/distribute.so
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/performance/write-behind.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/performance/write-behind.so
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/performance/read-ahead.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/performance/read-ahead.so
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/performance/io-cache.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/performance/io-cache.so
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/performance/quick-read.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/performance/quick-read.so
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/debug/io-stats.so...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/debug/io-stats.so
Reading symbols from /usr/lib/glusterfs/3.1.2/xlator/nfs/server.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/xlator/nfs/server.so
Reading symbols from /usr/lib/glusterfs/3.1.2/rpc-transport/socket.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/glusterfs/3.1.2/rpc-transport/socket.so
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libgcc_s.so.1
Core was generated by `/usr/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/ru'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f001d89ef70 in nfs_rpcsvc_submit_vectors () from /usr/lib/glusterfs/3.1.2/xlator/nfs/server.so
Comment 1 Shehjar Tikoo 2011-03-11 02:08:38 EST
Patches that fix this bug are available as part of bugs 2481 and 2504. They'll be part of release 3.1.3. Thanks for reporting.

*** This bug has been marked as a duplicate of bug 2481 ***
