+++ This bug was initially created as a clone of Bug #1556891 +++

Description of problem:
-----------------------
With a single-node RHHI installation, the engine VM is the only VM running, with its image on the gluster volume. After some time, the fuse mount crashed.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHV 4.2.2-4
RHGS 3.4.0 - glusterfs-3.12.2-5.el7rhgs ( interim build )

How reproducible:
-----------------
Hit it once

Steps to Reproduce:
-------------------
1. Create a distribute volume ( 1x1 ) and fuse mount it
2. Run the engine VM with its image on the gluster volume

Actual results:
---------------
Observed fuse mount crash

Expected results:
-----------------
Fuse mount should not crash
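For reference, a minimal command sketch of the reproduction steps above, assuming the single-node host and brick path shown in the volume info later in this report; the real RHHI deployment applies additional volume options (see comment 2) that are not repeated here:

# Sketch only; host and brick path assumed from the volume info in this bug.
gluster volume create engine 10.70.36.244:/gluster_bricks/engine/engine
gluster volume set engine features.shard on
gluster volume set engine features.shard-block-size 64MB
gluster volume start engine

# Fuse-mount the volume; the mount point name here is illustrative.
mkdir -p /mnt/engine
mount -t glusterfs 10.70.36.244:/engine /mnt/engine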
Is there a coredump or backtrace for this crash? Do the symbols indicate that the crash was in dht?
Engine VM is running on the plain distribute volume with sharding enabled and shard block-size set to 64MB.

1. Cluster info
----------------
There is only one node in the cluster

2. Volume info
---------------
[root@ ]# gluster volume info engine

Volume Name: engine
Type: Distribute
Volume ID: 17806a7c-64fb-4a9f-a313-f4e99df6231c
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.70.36.244:/gluster_bricks/engine/engine
Options Reconfigured:
auth.ssl-allow: rhsqa-grafton10.lab.eng.blr.redhat.com,rhsqa-grafton11.lab.eng.blr.redhat.com,rhsqa-grafton12.lab.eng.blr.redhat.com
client.ssl: on
server.ssl: on
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
user.cifs: off
network.ping-timeout: 30
network.remote-dio: off
performance.strict-o-direct: on
performance.low-prio-threads: 32
features.shard: on
storage.owner-gid: 36
storage.owner-uid: 36
transport.address-family: inet
nfs.disable: on

[root@ ]# gluster volume status engine
Status of volume: engine
Gluster process                                   TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.36.244:/gluster_bricks/engine/engine  49152     0          Y       49791

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

3. Other information
---------------------
1. Gluster encryption is enabled on management and data path
2. Sharding is enabled on this volume with shard-block-size set to 64MB
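A quick way to double-check individual options from the output above is `gluster volume get`; a sketch, run on the RHHI node, using the option names listed in the volume info:

# Confirm the sharding and encryption settings reported above.
gluster volume get engine features.shard
gluster volume get engine features.shard-block-size
gluster volume get engine client.ssl
gluster volume get engine server.ssl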
(In reply to Nithya Balachandran from comment #1)
> Is there a coredump or backtrace for this crash? Do the symbols indicate
> that the crash was in dht?

Nithya,

You responded before I could provide all the details; thanks for the quick follow-up. I will update the required information one by one.
[2018-03-14 18:37:46.880095] I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/bbd1fada-3cf8-42ba-8440-9a93990c37d9.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/bbd1fada-3cf8-42ba-8440-9a93990c37d9 (hash=engine-client-0/cache=<nul>)
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2 (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2.backup (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:35.920700] and [2018-03-14 18:37:36.339496]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2 (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:35.923482] and [2018-03-14 18:37:36.342288]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229 (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229.backup (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:36.162693] and [2018-03-14 18:37:36.575512]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229 (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:36.165706] and [2018-03-14 18:37:36.578852]

pending frames:
frame : type(1) op(FSYNC)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 2018-03-15 01:31:32
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f496c6163f0]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f496c620334]
/lib64/libc.so.6(+0x36280)[0x7f496ac75280]
[0x7f494402b148]
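The last frame in the logged backtrace ([0x7f494402b148]) has no symbol, so the coredump is the best way to confirm whether the crash is in dht, shard, or elsewhere. A sketch of pulling a full symbolized backtrace, assuming the matching debuginfo package is available and the core file path is known:

# Install debug symbols; the version must match the crashed build (3.12.2-5.el7rhgs here).
debuginfo-install glusterfs

# The fuse mount runs as /usr/sbin/glusterfs; open its core (path is a placeholder) and dump all stacks.
gdb /usr/sbin/glusterfs /path/to/core-file
(gdb) bt
(gdb) thread apply all bt full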
Created attachment 1408453 [details] fuse-mount-log
Created attachment 1408470 [details] sosreport
Created attachment 1408471 [details] coredump-file
One other piece of information missed in comment 2: the brick is created on top of a VDO volume.
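For context on the layering, a brick on a VDO volume is typically set up roughly as below; this is a sketch with an assumed backing device and names, not the exact commands used on this machine:

# Create the VDO volume on the backing disk (device name assumed).
vdo create --name=vdo_engine --device=/dev/sdb

# Format it as an XFS brick filesystem and mount it with discard enabled.
mkfs.xfs -i size=512 /dev/mapper/vdo_engine
mkdir -p /gluster_bricks/engine
mount -o discard /dev/mapper/vdo_engine /gluster_bricks/engine

# The brick directory used by the engine volume lives under this mount.
mkdir -p /gluster_bricks/engine/engine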
Tested with RHV 4.2 & RHGS 3.4.0 nightly (glusterfs-3.12.2-8). This issue is not seen with the steps in comment 0.
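For anyone re-verifying, the build under test can be confirmed on the node with rpm (standard package names assumed):

# Check the installed glusterfs build on the RHHI node.
rpm -q glusterfs glusterfs-fuse glusterfs-server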
*** Bug 1585044 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607