| Summary: | client crashes | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Benjamin Henrion <bh> | ||||
| Component: | replicate | Assignee: | Pranith Kumar K <pkarampu> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 3.1.2 | CC: | bh, gluster-bugs, rabhat, vijay, visonge | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | Type: | --- | |||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
Description
Benjamin Henrion
2011-03-08 07:09:07 UTC
The client crashes while I am shutting down some servers. The directory mounted at /glusterfs becomes unresponsive and reports:

root@machine / [92]# cd /glusterfs
-bash: cd: /glusterfs: Transport endpoint is not connected
root@machine / [93]# l
ls: cannot access glusterfs: Transport endpoint is not connected

See the attachment for the full log.

Steps to reproduce, followed by the backtrace with symbols:
mount a replicated volume.
start a dd for a file
bring one of the bricks down
remove the file from the mount point while the dd is still in progress
bring the brick back up
the next write call triggers an openfd self-heal, but the loc is all zeros, so loc_copy crashes the client
#0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:31
#1 0x00007f8312b8e015 in gf_strdup (src=0x0) at ../../../libglusterfs/src/mem-pool.h:89
#2 0x00007f8312b9187c in loc_copy (dst=0x7f83080024d0, src=0x7f83080008e8) at ../../../libglusterfs/src/xlator.c:1055
#3 0x00007f830f539bb8 in afr_build_parent_loc (parent=0x7f83080024d0, child=0x7f83080008e8) at ../../../../../xlators/cluster/afr/src/afr-dir-write.c:57
#4 0x00007f830f55fd45 in afr_self_heal_missing_entries (frame=0x7f83112fd94c, this=0x665200) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1419
#5 0x00007f830f5607cb in afr_self_heal (frame=0x7f8311526e10, this=0x665200) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1629
#6 0x00007f830f55177b in afr_openfd_sh (frame=0x7f8311526e10, this=0x665200) at ../../../../../xlators/cluster/afr/src/afr-open.c:437
#7 0x00007f830f556d4f in afr_internal_lock_finish (frame=0x7f8311526e10, this=0x665200) at ../../../../../xlators/cluster/afr/src/afr-transaction.c:1169
#8 0x00007f830f55657b in afr_post_blocking_inodelk_cbk (frame=0x7f8311526e10, this=0x665200) at ../../../../../xlators/cluster/afr/src/afr-transaction.c:954
#9 0x00007f830f57441b in afr_lock_blocking (frame=0x7f8311526e10, this=0x665200, child_index=2) at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:992
#10 0x00007f830f573932 in afr_lock_cbk (frame=0x7f8311526e10, cookie=0x1, this=0x665200, op_ret=-1, op_errno=77)
at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:756
#11 0x00007f830f5739a5 in afr_blocking_inodelk_cbk (frame=0x7f8311526e10, cookie=0x1, this=0x665200, op_ret=-1, op_errno=77)
at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:770
#12 0x00007f830f7bfb49 in client3_1_finodelk (frame=0x7f83115270a4, this=0x664600, data=0x7fffb1ca1110)
at ../../../../../xlators/protocol/client/src/client3_1-fops.c:4529
#13 0x00007f830f7a94c2 in client_finodelk (frame=0x7f83115270a4, this=0x664600, volume=0x664400 "vol-replicate-0", fd=0x7f830dd8d024, cmd=7, lock=0x7fffb1ca1260)
at ../../../../../xlators/protocol/client/src/client.c:1290
#14 0x00007f830f574701 in afr_lock_blocking (frame=0x7f8311526e10, this=0x665200, child_index=1) at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:1005
#15 0x00007f830f573932 in afr_lock_cbk (frame=0x7f8311526e10, cookie=0x0, this=0x665200, op_ret=0, op_errno=0)
at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:756
#16 0x00007f830f5739a5 in afr_blocking_inodelk_cbk (frame=0x7f8311526e10, cookie=0x0, this=0x665200, op_ret=0, op_errno=0)
at ../../../../../xlators/cluster/afr/src/afr-lk-common.c:770
#17 0x00007f830f7b2458 in client3_1_finodelk_cbk (req=0x7f830e42d4ac, iov=0x7f830e42d4ec, count=1, myframe=0x7f83115268e8)
at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1101
#18 0x00007f831296a51d in rpc_clnt_handle_reply (clnt=0x66e3f0, pollin=0x7f8308006170) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:757
#19 0x00007f831296a87c in rpc_clnt_notify (trans=0x66e510, mydata=0x66e420, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f8308006170)
at ../../../../rpc/rpc-lib/src/rpc-clnt.c:870
#20 0x00007f8312967c75 in rpc_transport_notify (this=0x66e510, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f8308006170)
at ../../../../rpc/rpc-lib/src/rpc-transport.c:1027
#21 0x00007f83103e7e44 in socket_event_poll_in (this=0x66e510) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1623
#22 0x00007f83103e81f7 in socket_event_handler (fd=7, idx=1, data=0x66e510, poll_in=1, poll_out=0, poll_err=0)
at ../../../../../rpc/rpc-transport/socket/src/socket.c:1737
#23 0x00007f8312bbc333 in event_dispatch_epoll_handler (event_pool=0x659330, events=0x65d9a0, i=0) at ../../../libglusterfs/src/event.c:812
#24 0x00007f8312bbc543 in event_dispatch_epoll (event_pool=0x659330) at ../../../libglusterfs/src/event.c:876
#25 0x00007f8312bbc8ab in event_dispatch (event_pool=0x659330) at ../../../libglusterfs/src/event.c:984
#26 0x00000000004067e8 in main (argc=5, argv=0x7fffb1ca1998) at ../../../glusterfsd/src/glusterfsd.c:1442
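Frames #0 to #2 of the backtrace show gf_strdup being called with src=0x0, so strlen runs on a null pointer. The following is a minimal sketch of that failure mode and of a defensive copy that refuses a zeroed record. It is not GlusterFS code: `struct demo_loc`, `demo_strdup` and `demo_loc_copy` are hypothetical names invented for illustration, and the real loc_t has more fields than the single path shown here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a location record.
 * This is NOT the real GlusterFS loc_t. */
struct demo_loc {
    char *path; /* NULL when the record is all zeros */
};

/* NULL-tolerant duplicate: returns NULL instead of calling strlen(NULL),
 * which is the call that faults in frame #0 of the backtrace. */
char *demo_strdup(const char *src)
{
    if (src == NULL)
        return NULL;

    size_t len = strlen(src) + 1; /* include the terminating NUL */
    char *dup = malloc(len);
    if (dup != NULL)
        memcpy(dup, src, len);
    return dup;
}

/* Copy src into dst. Returns 0 on success, or -1 when src is missing,
 * has no path (a zeroed record), or memory allocation fails.
 * dst is left untouched on failure. */
int demo_loc_copy(struct demo_loc *dst, const struct demo_loc *src)
{
    if (dst == NULL || src == NULL || src->path == NULL)
        return -1;

    char *copy = demo_strdup(src->path);
    if (copy == NULL)
        return -1;

    dst->path = copy;
    return 0;
}
```

With a zero-initialised `struct demo_loc` as the source, `demo_loc_copy` returns -1 rather than dereferencing a null path, so a caller can skip the operation. The caller owns the copied path and is expected to release it with `free`.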
PATCH: http://patches.gluster.com/patch/6400 in master (cluster/afr: skip openfd flush when the file is already deleted)
PATCH: http://patches.gluster.com/patch/6611 in master (features/marker: check for op_ret before doing any operations in lookup callback)
PATCH: http://patches.gluster.com/patch/6546 in release-3.1 (cluster/afr: skip openfd flush when the file is already deleted)
Tried the steps mentioned by Pranith. The client crashed on 3.1.3; with master it did not crash.