Bug 1298162
| Summary: | fuse mount crashed with mount point inaccessible and core found | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Nag Pavan Chilakam <nchilaka> |
| Component: | disperse | Assignee: | Pranith Kumar K <pkarampu> |
| Status: | CLOSED ERRATA | QA Contact: | Nag Pavan Chilakam <nchilaka> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.1 | CC: | asrivast, pkarampu, rcyriac, rhinduja, rhs-bugs, sankarshan, smohan, storage-qa-internal |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.1.3 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.7.9-1 | Doc Type: | Bug Fix |
| Doc Text: | A race condition between the gf_timer_call_cancel() and gf_timer_proc() calls in the disperse functionality sometimes led to a crash. This update corrects the race condition so that these calls occur in the correct order and the crash is avoided. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-06-23 05:02:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1299184 | | |
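
The Doc Text describes a race between cancelling a timer and the timer thread firing it. The following is a minimal Python sketch of that general pattern, not the GlusterFS patch itself: the names `TimerRegistry`, `TimerEvent`, `cancel`, `fire_next` and `race_once` are hypothetical, and delays are left out so that only the ordering problem remains. The idea shown is that both sides decide who owns an event while holding the same lock, so an event is either cancelled or fired, never both.

```python
import threading


class TimerEvent:
    """One pending callback; `fired` is only touched under the registry lock."""

    def __init__(self, callback):
        self.callback = callback
        self.fired = False


class TimerRegistry:
    """Hypothetical stand-in for a timer list shared by callers and a timer thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []

    def call_after(self, callback):
        # Register a callback; the delay is omitted in this sketch.
        event = TimerEvent(callback)
        with self._lock:
            self._pending.append(event)
        return event

    def cancel(self, event):
        """Return True if the event was cancelled before firing, else False."""
        with self._lock:
            if event.fired or event not in self._pending:
                # The timer thread already owns it (or it was cancelled before);
                # the caller must not release resources the callback may use.
                return False
            self._pending.remove(event)
            return True

    def fire_next(self):
        """One step of the timer thread: claim an event, then run it."""
        with self._lock:
            if not self._pending:
                return False
            event = self._pending.pop(0)
            event.fired = True  # ownership is decided while the lock is held
        event.callback()  # run outside the lock to avoid blocking cancel()
        return True


def race_once():
    """Race cancel() against fire_next(); exactly one of them should win."""
    registry = TimerRegistry()
    ran = []
    event = registry.call_after(lambda: ran.append(True))
    worker = threading.Thread(target=registry.fire_next)
    worker.start()
    cancelled = registry.cancel(event)
    worker.join()
    return cancelled, bool(ran)
```

In this sketch the return value of `cancel()` tells the caller whether it still owns the event. A caller that ignores a failed cancel and frees state the callback later touches would recreate the kind of crash described above.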
Description
Nag Pavan Chilakam
2016-01-13 11:39:23 UTC
sosreports
[nchilaka@rhsqe-repo bug.1298162]$ pwd
/home/repo/sosreports/nchilaka/bug.1298162
[nchilaka@rhsqe-repo bug.1298162]$ hostname
rhsqe-repo.lab.eng.blr.redhat.com

Found another crash on another client, but the core was not collected. Saw the following in the fuse mount log:

[root@rhsauto070 glusterfs]# tail mnt-testvol.log
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.5
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f6a26cb3002]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f6a26ccf47d]
/lib64/libc.so.6(+0x35670)[0x7f6a253a1670]
/lib64/libpthread.so.0(pthread_spin_lock+0x0)[0x7f6a25b20210]
---------

[root@rhsauto070 glusterfs]# tail -n 50 mnt-testvol.log
[2016-01-13 02:01:57.008411] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(+0x11c41) [0x7f6a24e19c41] -->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_replies_interpret+0x15e) [0x7f6a24e47e9e] -->/lib64/libglusterfs.so.0(dict_get+0xac) [0x7f6a26cab0cc] ) 2-dict: !this || key=glusterfs.bad-inode [Invalid argument]
[2016-01-13 02:01:57.008451] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_replies_interpret+0x1ad) [0x7f6a24e47eed] -->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_accuse_smallfiles+0x66) [0x7f6a24e47cb6] -->/lib64/libglusterfs.so.0(dict_get+0xac) [0x7f6a26cab0cc] ) 2-dict: !this || key=glusterfs.bad-inode [Invalid argument]
[2016-01-13 02:01:57.008512] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_replies_interpret+0x1ad) [0x7f6a24e47eed] -->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_accuse_smallfiles+0x66) [0x7f6a24e47cb6] -->/lib64/libglusterfs.so.0(dict_get+0xac) [0x7f6a26cab0cc] ) 2-dict: !this || key=glusterfs.bad-inode [Invalid argument]
[2016-01-13 02:01:57.008663] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_replies_interpret+0x1ad) [0x7f6a24e47eed] -->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_accuse_smallfiles+0x66) [0x7f6a24e47cb6] -->/lib64/libglusterfs.so.0(dict_get+0xac) [0x7f6a26cab0cc] ) 2-dict: !this || key=glusterfs.bad-inode [Invalid argument]
[2016-01-13 02:01:57.020032] I [MSGID: 109036] [dht-common.c:7950:dht_log_new_layout_for_dir_selfheal] 2-testvol-hot-dht: Setting layout of /kernel.leg/dir.45/linux-4.3.3/drivers/phy with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop: 1431655764 , Hash: 1 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start: 1431655765 , Stop: 2863311529 , Hash: 1 ], [Subvol_name: testvol-replicate-2, Err: -1 , Start: 2863311530 , Stop: 4294967295 , Hash: 1 ],
[2016-01-13 02:01:57.034867] I [MSGID: 109036] [dht-common.c:7950:dht_log_new_layout_for_dir_selfheal] 2-testvol-tier-dht: Setting layout of /kernel.leg/dir.45/linux-4.3.3/drivers/phy with [Subvol_name: testvol-cold-dht, Err: 28 , Start: 0 , Stop: 0 , Hash: 0 ], [Subvol_name: testvol-hot-dht, Err: -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ],
[2016-01-13 02:02:05.037947] E [MSGID: 122027] [ec-data.c:29:ec_cbk_data_allocate] 2-testvol-client-6: Mismatching xlators between request and answer (req=testvol-disperse-1, ans=testvol-client-6). [Invalid argument]
The message "E [MSGID: 122027] [ec-data.c:29:ec_cbk_data_allocate] 2-testvol-client-6: Mismatching xlators between request and answer (req=testvol-disperse-1, ans=testvol-client-6). [Invalid argument]" repeated 4 times between [2016-01-13 02:02:05.037947] and [2016-01-13 02:02:05.039945]
[2016-01-13 02:02:05.040251] E [MSGID: 122027] [ec-data.c:29:ec_cbk_data_allocate] 2-testvol-client-6: Mismatching xlators between request and answer (req=testvol-disperse-1, ans=testvol-client-6). [Invalid argument]
pending frames:
frame : type(1) op(STAT)
frame : type(1) op(STAT)
frame : type(1) op(STAT)
frame : type(1) op(STAT)
frame : type(1) op(CREATE)
frame : type(1) op(CREATE)
frame : type(1) op(CREATE)
frame : type(1) op(CREATE)
pending frames:
frame : type(1) op(STAT)
frame : type(1) op(STAT)
frame : type(1) op(STAT)
frame : type(1) op(STAT)
frame : type(1) op(CREATE)
frame : type(1) op(CREATE)
frame : type(1) op(CREATE)
frame : type(1) op(CREATE)
frame : type(1) op(OPEN)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2016-01-13 02:02:05
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.5
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f6a26cb3002]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f6a26ccf47d]
/lib64/libc.so.6(+0x35670)[0x7f6a253a1670]
/lib64/libpthread.so.0(pthread_spin_lock+0x0)[0x7f6a25b20210]
---------

10.70.35.190:testvol on /mnt/testvol type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
10.70.35.190:testdb on /mnt/tesdb type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
[root@rhsauto070 glusterfs]# cd /mnt/testvol
-bash: cd: /mnt/testvol: Transport endpoint is not connected
[root@rhsauto070 glusterfs]#

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.

I ran the steps mentioned above for about 4 days and didn't see any issue, hence closing the bug as fixed.
Test version: 3.7.9-3

Looks good to me.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2016:1240