Bug 1556895 - [RHHI] Fuse mount crashed with only one VM running with its image on that volume
Summary: [RHHI] Fuse mount crashed with only one VM running with its image on that volume
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: sharding
Version: rhhiv-1.5
Hardware: x86_64
OS: Linux
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Pranith Kumar K
Duplicates: 1585044 (view as bug list)
Depends On:
Blocks: 1503137 1556891 1559831 1585044 1585046
Reported: 2018-03-15 13:08 UTC by SATHEESARAN
Modified: 2018-09-04 06:45 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.12.2-6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1556891
Clones: 1557876 1585044 1585046 (view as bug list)
Last Closed: 2018-09-04 06:44:14 UTC
Target Upstream Version:

Attachments
fuse-mount-log (132.94 KB, text/plain)
2018-03-15 14:51 UTC, SATHEESARAN
sosreport (15.90 MB, application/x-xz)
2018-03-15 15:13 UTC, SATHEESARAN
coredump-file (1.06 MB, application/x-gzip)
2018-03-15 15:14 UTC, SATHEESARAN

Links:
Red Hat Product Errata RHSA-2018:2607 (last updated 2018-09-04 06:45:42 UTC)

Description SATHEESARAN 2018-03-15 13:08:46 UTC
+++ This bug was initially created as a clone of Bug #1556891 +++

Description of problem:
With a single-node RHHI installation, the Hosted Engine VM is the only VM running, with its image on the gluster 'engine' volume. After some time, the fuse mount crashed.

Version-Release number of selected component (if applicable):
RHV 4.2.2-4
RHGS 3.4.0 - glusterfs-3.12.2-5.el7rhgs (interim build)

How reproducible:
Hit it once

Steps to Reproduce:
1. Create a distribute volume (1x1) and fuse-mount it
2. Run the Hosted Engine VM with its image on the gluster volume (see the sketch below)
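
A minimal shell sketch of that setup, with the sharding settings described later in comment 2; the host name, brick path, and mount point are placeholders:

# 1x1 plain distribute volume on a single brick (host/path are placeholders)
gluster volume create engine host1:/gluster_bricks/engine/engine
# Sharding with 64MB shards, matching the configuration in comment 2
gluster volume set engine features.shard on
gluster volume set engine features.shard-block-size 64MB
gluster volume start engine
# Fuse-mount; the Hosted Engine VM disk image lives under this mount point
mount -t glusterfs host1:/engine /mnt/engine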

Actual results:
Observed fuse mount crash

Expected results:
Fuse mount should not crash

Comment 1 Nithya Balachandran 2018-03-15 14:06:20 UTC
Is there a coredump or backtrace for this crash? Do the symbols indicate that the crash was in dht?

Comment 2 SATHEESARAN 2018-03-15 14:45:50 UTC
The Engine VM is running on the plain distribute volume, with sharding enabled and shard-block-size set to 64MB.

1. Cluster info
There is only one node in the cluster

2. Volume info
[root@ ]# gluster volume info engine
Volume Name: engine
Type: Distribute
Volume ID: 17806a7c-64fb-4a9f-a313-f4e99df6231c
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Options Reconfigured:
auth.ssl-allow: rhsqa-grafton10.lab.eng.blr.redhat.com,rhsqa-grafton11.lab.eng.blr.redhat.com,rhsqa-grafton12.lab.eng.blr.redhat.com
client.ssl: on
server.ssl: on
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
user.cifs: off
network.ping-timeout: 30
network.remote-dio: off
performance.strict-o-direct: on
performance.low-prio-threads: 32
features.shard: on
storage.owner-gid: 36
storage.owner-uid: 36
transport.address-family: inet
nfs.disable: on

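Most of the options above correspond to the RHGS 'virt' group profile used for VM image stores; a sketch of how such a configuration is usually applied (36:36 is the vdsm:kvm owner that RHV expects):

# Apply the virt option group shipped with RHGS (/var/lib/glusterd/groups/virt)
gluster volume set engine group virt
# Hand the volume root to vdsm:kvm (uid/gid 36), as RHV requires
gluster volume set engine storage.owner-uid 36
gluster volume set engine storage.owner-gid 36
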
[root@ ]# gluster volume status engine
Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
ngine                                       49152     0          Y       49791

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

3. Other information
1. Gluster encryption (SSL) is enabled on both the management and data paths
2. Sharding is enabled on this volume, with shard-block-size set to 64MB
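
Those two settings can be verified directly on a live system; a sketch using the volume name from above:

# Confirm sharding is enabled and check the configured shard size
gluster volume get engine features.shard
gluster volume get engine features.shard-block-size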

Comment 3 SATHEESARAN 2018-03-15 14:47:29 UTC
(In reply to Nithya Balachandran from comment #1)
> Is there a coredump or backtrace for this crash? Do they symbols indicate
> that the crash was in dht?


You were too quick to respond, before I could provide all the details.
Thanks for the follow-up. I will update the required info one by one.

Comment 4 SATHEESARAN 2018-03-15 14:50:47 UTC
[2018-03-14 18:37:46.880095] I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/bbd1fada-3cf8-42ba-8440-9a93990c37d9.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/bbd1fada-3cf8-42ba-8440-9a93990c37d9 (hash=engine-client-0/cache=<nul>)
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2 (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2.backup (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:35.920700] and [2018-03-14 18:37:36.339496]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2 (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:35.923482] and [2018-03-14 18:37:36.342288]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229 (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229.backup (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:36.162693] and [2018-03-14 18:37:36.575512]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229 (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:36.165706] and [2018-03-14 18:37:36.578852]
pending frames:
frame : type(1) op(FSYNC)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2018-03-15 01:31:32
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
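
For reference, picking up the question in comment 1: a minimal sketch of pulling a backtrace out of the coredump attached in comment 7. The binary path is the standard one for a glusterfs fuse client; the core filename is a placeholder:

# Install debug symbols for the glusterfs packages (RHEL/RHGS host)
debuginfo-install -y glusterfs glusterfs-fuse
# Load the core against the fuse client binary and dump every thread's stack
gdb /usr/sbin/glusterfs /path/to/core.<pid>
(gdb) thread apply all bt full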

Comment 5 SATHEESARAN 2018-03-15 14:51:42 UTC
Created attachment 1408453 [details]
fuse-mount-log

Comment 6 SATHEESARAN 2018-03-15 15:13:36 UTC
Created attachment 1408470 [details]
sosreport

Comment 7 SATHEESARAN 2018-03-15 15:14:10 UTC
Created attachment 1408471 [details]
coredump-file

Comment 8 SATHEESARAN 2018-03-15 15:20:36 UTC
One other detail missed in comment 2: the brick is created over a VDO volume.
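
To capture the state of that layer as well, the VDO tooling can be queried; a sketch, assuming the standard vdo/vdostats utilities are installed on the host:

# Report configuration and health of all VDO volumes on the host
vdo status
# Show physical/logical usage and space savings per VDO volume
vdostats --human-readable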

Comment 18 SATHEESARAN 2018-04-29 10:35:34 UTC
Tested with RHV 4.2 & RHGS 3.4.0 nightly (3.12.2-8)

This issue is not seen with the steps in comment 0.

Comment 19 Sunil Kumar Acharya 2018-06-12 09:18:36 UTC
*** Bug 1585044 has been marked as a duplicate of this bug. ***

Comment 21 errata-xmlrpc 2018-09-04 06:44:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

