| Summary: | [3.0.4rc2] Crash in afr_up_down_flush_post_post_op | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Anush Shetty <anush> |
| Component: | replicate | Assignee: | Pavan Vilas Sondur <pavan> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | low | | |
| Version: | mainline | CC: | aavati, amarts, gluster-bugs, rabhat, vijay |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Here we need to handle the case of 'inode_path()' returning NULL; that should fix the issue. The 3.1.x releases will have a proper NULL pointer check and should not result in such crashes.

PATCH: http://patches.gluster.com/patch/3305 in release-3.0 (cluster/afr: Handle open-fds of unlinked files during a possible self heal gracefully.)

Verified with glusterfs-3.0.5rc6: ran a dd of a 1 GB file and `rm -rf *` in a loop on the mount point, and took the server down and brought it back up several times (using `while true`). It did not crash. 3.0.4 and 3.0.4rc2 crashed with the same procedure.
On a replicate setup with 2 servers, I tried running:

Screen 1: dd of a 1 GB file (write)
Screen 2: `rm -rf *` in a loop
Screen 3: killed server1, brought server1 up

There was a crash:

```
(gdb) bt
#0  strrchr () at ../sysdeps/x86_64/strrchr.S:33
#1  0x00007fa36d549832 in afr_up_down_flush_post_post_op (frame=0x7fa3652919f0, this=0x179bd10) at afr-open.c:362
#2  0x00007fa36d54bd48 in afr_changelog_post_op_cbk (frame=0x7fa3652919f0, cookie=0x7fa360732750, this=0x179bd10, op_ret=0, op_errno=22, xattr=0x7fa3607895b0) at afr-transaction.c:697
#3  0x00007fa36d7924a7 in client_fxattrop_cbk (frame=0x7fa360732750, hdr=0x7fa3607a0d60, hdrlen=197, iobuf=0x0) at client-protocol.c:3857
#4  0x00007fa36d79a6d6 in protocol_client_interpret (this=0x179b7f0, trans=0x179f260, hdr_p=0x7fa3607a0d60 "", hdrlen=197, iobuf=0x0) at client-protocol.c:6529
#5  0x00007fa36d79b457 in protocol_client_pollin (this=0x179b7f0, trans=0x179f260) at client-protocol.c:6827
#6  0x00007fa36d79ba50 in notify (this=0x179b7f0, event=2, data=0x179f260) at client-protocol.c:6946
#7  0x00007fa36e94dd2a in xlator_notify (xl=0x179b7f0, event=2, data=0x179f260) at xlator.c:924
#8  0x00007fa36c2bd45a in socket_event_poll_in (this=0x179f260) at socket.c:731
#9  0x00007fa36c2bd78d in socket_event_handler (fd=11, idx=0, data=0x179f260, poll_in=1, poll_out=0, poll_err=0) at socket.c:831
#10 0x00007fa36e9739ee in event_dispatch_epoll_handler (event_pool=0x1795320, events=0x17a2560, i=1) at event.c:804
#11 0x00007fa36e973be0 in event_dispatch_epoll (event_pool=0x1795320) at event.c:867
#12 0x00007fa36e973eff in event_dispatch (event_pool=0x1795320) at event.c:975
#13 0x0000000000406869 in main (argc=8, argv=0x7fffcf2dea28) at glusterfsd.c:1413
```
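A sketch of the write/unlink race from the report, using a plain local directory (`$WORKDIR`) as a stand-in for the replicate mount point; on a real setup `$WORKDIR` would be the glusterfs mount and a third loop would kill and restart one of the brick servers (the file size is reduced from 1 GB to 1 MB for brevity):

```shell
#!/bin/sh
# Stand-in for the glusterfs mount point used in the original report.
WORKDIR=$(mktemp -d)

# Screen 1: write a large file in the background.
dd if=/dev/zero of="$WORKDIR/bigfile" bs=1024 count=1024 2>/dev/null &
WRITER=$!

# Screen 2: remove everything in a loop while the write is in flight,
# so the file can be unlinked while its fd is still open.
for i in 1 2 3 4 5; do
        rm -rf "$WORKDIR"/* 2>/dev/null
done

wait "$WRITER" 2>/dev/null
echo "done"
rm -rf "$WORKDIR"
```

The unlink-while-open race is what left afr holding an fd whose inode no longer has a path, triggering the NULL return from inode_path().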