Bug 1556895 - [RHHI] Fuse mount crashed with only one VM running with its image on that volume
Summary: [RHHI] Fuse mount crashed with only one VM running with its image on that volume
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: sharding
Version: rhhiv-1.5
Hardware: x86_64
OS: Linux
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Pranith Kumar K
Duplicates: 1585044 (view as bug list)
Depends On:
Blocks: 1503137 1556891 1559831 1585044 1585046
Reported: 2018-03-15 13:08 UTC by SATHEESARAN
Modified: 2018-09-04 06:45 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.12.2-6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1556891
Clones: 1557876 1585044 1585046 (view as bug list)
Last Closed: 2018-09-04 06:44:14 UTC
Target Upstream Version:

Attachments
fuse-mount-log (132.94 KB, text/plain)
2018-03-15 14:51 UTC, SATHEESARAN
sosreport (15.90 MB, application/x-xz)
2018-03-15 15:13 UTC, SATHEESARAN
coredump-file (1.06 MB, application/x-gzip)
2018-03-15 15:14 UTC, SATHEESARAN

Links:
Red Hat Product Errata RHSA-2018:2607 (last updated 2018-09-04 06:45:42 UTC)

Description SATHEESARAN 2018-03-15 13:08:46 UTC
+++ This bug was initially created as a clone of Bug #1556891 +++

Description of problem:
With a single-node RHHI installation, the Hosted Engine VM is the only VM running, with its image on the gluster 'engine' volume. After some time, the fuse mount crashed.

Version-Release number of selected component (if applicable):
RHV 4.2.2-4
RHGS 3.4.0 - glusterfs-3.12.2-5.el7rhgs (interim build)

How reproducible:
Hit it once

Steps to Reproduce:
1. Create a distribute volume (1x1) and fuse-mount it
2. Run the Hosted Engine VM with its image on the gluster volume (see the sketch below)
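
A minimal shell sketch of that setup, with the sharding settings described later in comment 2; the host name, brick path, and mount point are placeholders:

# 1x1 plain distribute volume on a single brick (host/path are placeholders)
gluster volume create engine host1:/gluster_bricks/engine/engine
# Sharding with 64MB shards, matching the configuration in comment 2
gluster volume set engine features.shard on
gluster volume set engine features.shard-block-size 64MB
gluster volume start engine
# Fuse-mount; the Hosted Engine VM disk image lives under this mount point
mount -t glusterfs host1:/engine /mnt/engine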

Actual results:
Observed fuse mount crash

Expected results:
Fuse mount should not crash

Comment 1 Nithya Balachandran 2018-03-15 14:06:20 UTC
Is there a coredump or backtrace for this crash? Do the symbols indicate that the crash was in dht?

Comment 2 SATHEESARAN 2018-03-15 14:45:50 UTC
The Engine VM is running on the plain distribute volume, with sharding enabled and shard-block-size set to 64MB.

1. Cluster info
There is only one node in the cluster

2. Volume info
[root@ ]# gluster volume info engine
Volume Name: engine
Type: Distribute
Volume ID: 17806a7c-64fb-4a9f-a313-f4e99df6231c
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Options Reconfigured:
auth.ssl-allow: rhsqa-grafton10.lab.eng.blr.redhat.com,rhsqa-grafton11.lab.eng.blr.redhat.com,rhsqa-grafton12.lab.eng.blr.redhat.com
client.ssl: on
server.ssl: on
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
user.cifs: off
network.ping-timeout: 30
network.remote-dio: off
performance.strict-o-direct: on
performance.low-prio-threads: 32
features.shard: on
storage.owner-gid: 36
storage.owner-uid: 36
transport.address-family: inet
nfs.disable: on

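Most of the options above correspond to the RHGS 'virt' group profile used for VM image stores; a sketch of how such a configuration is usually applied (36:36 is the vdsm:kvm owner that RHV expects):

# Apply the virt option group shipped with RHGS (/var/lib/glusterd/groups/virt)
gluster volume set engine group virt
# Hand the volume root to vdsm:kvm (uid/gid 36), as RHV requires
gluster volume set engine storage.owner-uid 36
gluster volume set engine storage.owner-gid 36
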
[root@ ]# gluster volume status engine
Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
ngine                                       49152     0          Y       49791

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

3. Other information
1. Gluster encryption (SSL) is enabled on both the management and data paths
2. Sharding is enabled on this volume, with shard-block-size set to 64MB
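
Those two settings can be verified directly on a live system; a sketch using the volume name from above:

# Confirm sharding is enabled and check the configured shard size
gluster volume get engine features.shard
gluster volume get engine features.shard-block-size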

Comment 3 SATHEESARAN 2018-03-15 14:47:29 UTC
(In reply to Nithya Balachandran from comment #1)
> Is there a coredump or backtrace for this crash? Do they symbols indicate
> that the crash was in dht?


You were too quick to respond, before I could provide all the details.
Thanks for the follow-up. I will update the required info one by one.

Comment 4 SATHEESARAN 2018-03-15 14:50:47 UTC
[2018-03-14 18:37:46.880095] I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/bbd1fada-3cf8-42ba-8440-9a93990c37d9.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/bbd1fada-3cf8-42ba-8440-9a93990c37d9 (hash=engine-client-0/cache=<nul>)
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2 (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2.backup (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:35.920700] and [2018-03-14 18:37:36.339496]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/591f5eb6-8451-48c9-8654-2ab3b69a2fb2 (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:35.923482] and [2018-03-14 18:37:36.342288]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229 (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229.backup (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:36.162693] and [2018-03-14 18:37:36.575512]
The message "I [MSGID: 109066] [dht-rename.c:1741:dht_rename] 0-engine-dht: renaming /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229.temp (hash=engine-client-0/cache=engine-client-0) => /6e464c6f-1f1a-45b7-a7a7-8faf7a88e155/master/tasks/02dc32cb-7692-453a-bd12-be617103c229 (hash=engine-client-0/cache=<nul>)" repeated 9 times between [2018-03-14 18:37:36.165706] and [2018-03-14 18:37:36.578852]
pending frames:
frame : type(1) op(FSYNC)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2018-03-15 01:31:32
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
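
For reference, picking up the question in comment 1: a minimal sketch of pulling a backtrace out of the coredump attached in comment 7. The binary path is the standard one for a glusterfs fuse client; the core filename is a placeholder:

# Install debug symbols for the glusterfs packages (RHEL/RHGS host)
debuginfo-install -y glusterfs glusterfs-fuse
# Load the core against the fuse client binary and dump every thread's stack
gdb /usr/sbin/glusterfs /path/to/core.<pid>
(gdb) thread apply all bt full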

Comment 5 SATHEESARAN 2018-03-15 14:51:42 UTC
Created attachment 1408453 [details]
fuse-mount-log

Comment 6 SATHEESARAN 2018-03-15 15:13:36 UTC
Created attachment 1408470 [details]
sosreport

Comment 7 SATHEESARAN 2018-03-15 15:14:10 UTC
Created attachment 1408471 [details]
coredump-file

Comment 8 SATHEESARAN 2018-03-15 15:20:36 UTC
One other detail missed in comment 2: the brick is created over a VDO volume.
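
To capture the state of that layer as well, the VDO tooling can be queried; a sketch, assuming the standard vdo/vdostats utilities are installed on the host:

# Report configuration and health of all VDO volumes on the host
vdo status
# Show physical/logical usage and space savings per VDO volume
vdostats --human-readable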

Comment 18 SATHEESARAN 2018-04-29 10:35:34 UTC
Tested with RHV 4.2 & RHGS 3.4.0 nightly (3.12.2-8)

This issue is not seen with the steps in comment 0.

Comment 19 Sunil Kumar Acharya 2018-06-12 09:18:36 UTC
*** Bug 1585044 has been marked as a duplicate of this bug. ***

Comment 21 errata-xmlrpc 2018-09-04 06:44:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

