Bug 2128703
Summary: | [GSS] VDSM Problem while trying to mount target | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rafrojas <rafrojas>
Component: | core | Assignee: | Mohit Agrawal <moagrawa>
Status: | CLOSED DUPLICATE | QA Contact: |
Severity: | urgent | Docs Contact: |
Priority: | urgent | |
Version: | rhgs-3.4 | CC: | moagrawa, rhs-bugs, sajmoham
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2022-09-23 05:12:43 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Rafrojas
2022-09-21 12:51:43 UTC
Hi,

As per the logs, this appears to be a known issue; it is most likely the same as bug https://bugzilla.redhat.com/show_bug.cgi?id=1917488. I can confirm after checking the coredump once the setup is available.

```
[2022-09-20 11:33:05.591614] E [MSGID: 133010] [shard.c:2299:shard_common_lookup_shards_cbk] 0-data-shard: Lookup on shard 1729 failed. Base file gfid = 98f326c2-6a81-48c1-81e5-d93b41edb543 [Stale file handle]
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2022-09-20 11:33:05
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9d)[0x7f6df6b11bdd]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f6df6b1c154]
/lib64/libc.so.6(+0x363f0)[0x7f6df514b3f0]
/lib64/libuuid.so.1(+0x2570)[0x7f6df6272570]
/lib64/libuuid.so.1(+0x2606)[0x7f6df6272606]
/lib64/libglusterfs.so.0(uuid_utoa+0x1c)[0x7f6df6b1b2ec]
```

This is a known issue, and we have already backported the patch in a downstream release (6.0.57). A FUSE process crashes due to a bug in write-behind while truncating a file. The patch is merged in the downstream build (glusterfs-fuse-6.0-57, from bug https://bugzilla.redhat.com/show_bug.cgi?id=1917488). Either the user has to upgrade to the latest downstream release, or we can suggest disabling write-behind to avoid the crash.

Thanks,
Mohit Agrawal

Hi,

Thanks for sharing the environment to debug the core. The client process is crashing because the shard xlator is trying to access an inode that has already been unlinked while shard reattempts cleanup during a remount. This is a known issue, and it is already fixed in release glusterfs-6.0.35 (https://bugzilla.redhat.com/show_bug.cgi?id=1836233).
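The write-behind workaround suggested in the first comment above is normally applied with the `gluster volume set` CLI; a sketch, assuming a volume named `data` (substitute the real volume name):

```shell
# Disable the write-behind performance translator on the affected volume
# ("data" is a placeholder volume name, not taken from this bug report).
gluster volume set data performance.write-behind off

# Confirm the option is now off.
gluster volume get data performance.write-behind
```

This avoids the write-behind truncate crash at some cost to write performance, so it is only a stopgap until the upgrade.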
```
(gdb) bt
#0  0x00007f916ca2e570 in uuid_unpack () from /lib64/libuuid.so.1
#1  0x00007f916ca2e606 in uuid_unparse_x () from /lib64/libuuid.so.1
#2  0x00007f916d2d72ec in gf_uuid_unparse (out=0x7f9130006cd0 "98f326c2-6a81-48c1-81e5-d93b41edb543", uuid=0x8 <Address 0x8 out of bounds>) at compat-uuid.h:57
#3  uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>) at common-utils.c:2852
#4  0x00007f915e805596 in shard_post_lookup_shards_unlink_handler (frame=<optimized out>, this=0x7f915801e8d0) at shard.c:2915
#5  0x00007f915e803fa5 in shard_common_lookup_shards (frame=frame@entry=0x7f914801b598, this=this@entry=0x7f915801e8d0, inode=<optimized out>, handler=handler@entry=0x7f915e805540 <shard_post_lookup_shards_unlink_handler>) at shard.c:2458
#6  0x00007f915e80561c in shard_post_resolve_unlink_handler (frame=frame@entry=0x7f914801b598, this=this@entry=0x7f915801e8d0) at shard.c:2939
#7  0x00007f915e801b47 in shard_common_resolve_shards (frame=frame@entry=0x7f914801b598, this=this@entry=0x7f915801e8d0, post_res_handler=post_res_handler@entry=0x7f915e8055f0 <shard_post_resolve_unlink_handler>) at shard.c:1069
#8  0x00007f915e805721 in shard_regulated_shards_deletion (cleanup_frame=cleanup_frame@entry=0x7f914801b598, this=this@entry=0x7f915801e8d0, now=now@entry=100, first_block=first_block@entry=1701, entry=entry@entry=0x7f914c021c30) at shard.c:3178
#9  0x00007f915e805d84 in __shard_delete_shards_of_entry (cleanup_frame=cleanup_frame@entry=0x7f914801b598, this=this@entry=0x7f915801e8d0, entry=entry@entry=0x7f914c021c30, inode=inode@entry=0x7f914c00f888) at shard.c:3339
#10 0x00007f915e806196 in shard_delete_shards_of_entry (cleanup_frame=cleanup_frame@entry=0x7f914801b598, this=this@entry=0x7f915801e8d0, entry=entry@entry=0x7f914c021c30, inode=inode@entry=0x7f914c00f888) at shard.c:3395
#11 0x00007f915e80687f in shard_delete_shards (opaque=0x7f914801b598) at shard.c:3619
#12 0x00007f916d307840 in synctask_wrap () at syncop.c:375
#13 0x00007f916b919180 in ?? () from /lib64/libc.so.6
#14 0x0000000000000000 in ?? ()
(gdb) f 4
#4  0x00007f915e805596 in shard_post_lookup_shards_unlink_handler (frame=<optimized out>, this=0x7f915801e8d0) at shard.c:2915
2915                gf_msg (this->name, GF_LOG_ERROR, local->op_errno,
(gdb) l
2910        shard_local_t *local = NULL;
2911
2912        local = frame->local;
2913
2914        if ((local->op_ret < 0) && (local->op_errno != ENOENT)) {
2915                gf_msg (this->name, GF_LOG_ERROR, local->op_errno,
2916                        SHARD_MSG_FOP_FAILED, "failed to delete shards of %s",
2917                        uuid_utoa (local->resolver_base_inode->gfid));
2918                return 0;
2919        }
(gdb) p local->resolver_base_inode
$2 = (inode_t *) 0x0
(gdb) p local->resolver_base_inode->gfid
Cannot access memory at address 0x8
```

Can we ask them to upgrade the environment to avoid the crash? The earlier suggested workaround will not work in this case. They were also hitting the write-behind issue (its traceback was captured in the logs), but no coredump is available for it, so we should suggest upgrading the environment to a release at or after 6.0.57 to avoid both issues.

Thanks,
Mohit Agrawal

*** This bug has been marked as a duplicate of bug 1836233 ***