Bug 1724891 - [RHHI-V] glusterd crashes after upgrade and unable to start it again
Summary: [RHHI-V] glusterd crashes after upgrade and unable to start it again
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhhi
Version: rhhiv-1.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHHI-V 1.7
Assignee: Sahina Bose
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1724885
Blocks:
 
Reported: 2019-06-28 03:21 UTC by SATHEESARAN
Modified: 2020-02-13 15:57 UTC
CC List: 6 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1724885
Environment:
Last Closed: 2020-02-13 15:57:23 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:0508 0 None None None 2020-02-13 15:57:37 UTC

Description SATHEESARAN 2019-06-28 03:21:00 UTC
Description of problem:
-----------------------
RHHI-V 1.6 async uses glusterfs-3.12.2-47.el7rhgs + RHEL 7.6 + RHVH 4.3.3 async2
When upgrading to RHVH 4.3.5 (RHEL 7.7 based RHVH), glusterd crashed on reboot of the host and refuses to start from then on

Brief update on the upgrade procedure for clarity:
1. A RHVH node is essentially a trimmed-down version of RHEL
2. An RHVH upgrade happens via image update, and the node reboots automatically after the upgrade
3. The latest image doesn't contain glusterfs-6.0-6, so the image is updated first and the node rebooted, then the glusterfs packages are updated from glusterfs-3.12.2-47.2 to glusterfs-6.0-6. Note that the glusterfs package was originally glusterfs-3.12.2-47, then upgraded to glusterfs-3.12.2-47.2, and then to glusterfs-6.0-6. No op-version changes happened so far.
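The three-step package path above (47 → 47.2 → 6.0-6) can be sketched in shell. The `yum` lines are illustrative assumptions only (the actual update is driven by the image-update flow from the RHV Manager), while the `sort -V` line simply confirms that the three builds order as expected under version sorting:

```shell
# Illustrative only: the real flow is image update + automatic reboot via
# the RHV Manager; these commented yum commands are an assumption, not the
# exact mechanism used.
#   yum update glusterfs\*   # 3.12.2-47.2 -> 6.0-6
#   systemctl reboot

# Confirm the intended upgrade ordering of the three glusterfs builds
# (sort -V approximates rpm version ordering for these simple strings):
printf '%s\n' glusterfs-6.0-6 glusterfs-3.12.2-47.2 glusterfs-3.12.2-47 | sort -V
```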

Version-Release number of selected component (if applicable):
---------------------------------------------------------------
RHVH 4.3.5 based on RHEL 7.7
glusterfs-6.0-6

How reproducible:
-----------------
4/4

Steps to Reproduce:
-------------------
1. Upgrade all the RHVH 4.3.3 nodes to RHVH 4.3.5 (based on RHEL 7.7) from the RHV Manager UI.
Initial version of gluster here is: glusterfs-3.12.2-47.el7rhgs
Observation: Upgrade successful on all the nodes, reboot successful

2. Upgrade glusterfs packages from glusterfs-3.12.2-47.2 to glusterfs-6.0-6 on one of the nodes and reboot it

Actual results:
----------------
glusterd crashed on the node and never starts up again

Expected results:
-----------------
glusterd should not crash

--- Additional comment from SATHEESARAN on 2019-06-28 02:56:36 UTC ---

Here is the snippet from glusterd.log

<snip>
[2019-06-28 02:55:05.340989] I [MSGID: 106487] [glusterd-handler.c:1498:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2019-06-28 02:55:06.899818] E [MSGID: 101005] [dict.c:2852:dict_serialized_length_lk] 0-dict: value->len (-1162167622) < 0 [Invalid argument]
[2019-06-28 02:55:06.899848] E [MSGID: 106130] [glusterd-handler.c:2633:glusterd_op_commit_send_resp] 0-management: failed to get serialized length of dict
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2019-06-28 02:55:06
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6.0
/lib64/libglusterfs.so.0(+0x27240)[0x7f420fbd4240]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f420fbdec64]
/lib64/libc.so.6(+0x363f0)[0x7f420e2103f0]
/lib64/libpthread.so.0(pthread_mutex_lock+0x0)[0x7f420ea14d00]
/lib64/libglusterfs.so.0(__gf_free+0x12c)[0x7f420fc004cc]
/lib64/libglusterfs.so.0(+0x1b889)[0x7f420fbc8889]
/usr/lib64/glusterfs/6.0/xlator/mgmt/glusterd.so(+0x478f8)[0x7f4203d0f8f8]
/usr/lib64/glusterfs/6.0/xlator/mgmt/glusterd.so(+0x44514)[0x7f4203d0c514]
/usr/lib64/glusterfs/6.0/xlator/mgmt/glusterd.so(+0x1d19e)[0x7f4203ce519e]
/usr/lib64/glusterfs/6.0/xlator/mgmt/glusterd.so(+0x24dce)[0x7f4203cecdce]
/lib64/libglusterfs.so.0(+0x66610)[0x7f420fc13610]
/lib64/libc.so.6(+0x48180)[0x7f420e222180]
</snip>
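The failure signature in the snippet above can be pulled out mechanically. A minimal sketch, using two lines copied verbatim from the log: the negative dict value length (MSGID 101005) that precedes the crash, and the SIGSEGV marker:

```shell
# Minimal sketch: grep a glusterd.log excerpt for the two tell-tale lines,
# the negative dict length (MSGID 101005) and the "signal received: 11"
# SIGSEGV marker. The excerpt is copied from the log snippet in this report.
log='[2019-06-28 02:55:06.899818] E [MSGID: 101005] [dict.c:2852:dict_serialized_length_lk] 0-dict: value->len (-1162167622) < 0 [Invalid argument]
signal received: 11'

printf '%s\n' "$log" | grep -E 'MSGID: 101005|signal received'
```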

Comment 5 SATHEESARAN 2019-06-28 03:25:03 UTC
All the relevant logs are available as part of the dependent bug BZ 1724885

Comment 6 SATHEESARAN 2019-07-17 07:25:40 UTC
Tested with RHVH 4.3.5 based on RHEL 7.7
1. Upgrade was triggered from RHGS 3.4.4 async ( glusterfs-3.12.2-47.2 ) to RHGS 3.5.0 interim ( glusterfs-6.0-7 )
No crashes observed

Comment 8 Yaniv Kaul 2019-11-25 10:17:08 UTC
Why was it moved to NEW again?

Comment 9 SATHEESARAN 2019-11-27 10:35:32 UTC
(In reply to Yaniv Kaul from comment #8)
> Why was it moved to NEW again?

I just wanted to remove the inflight tracker and accidentally changed the state

Comment 14 errata-xmlrpc 2020-02-13 15:57:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0508

