Bug 1550389

Summary: RHGS 3.4 node goes to non operational state when imported into RHGS-C
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Sweta Anandpara <sanandpa>
Component: rhsc
Assignee: Sahina Bose <sabose>
Status: CLOSED CURRENTRELEASE
QA Contact: Sweta Anandpara <sanandpa>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.4
CC: rhinduja, rhs-bugs, rhsc-qe-bugs, sanandpa, storage-qa-internal
Target Milestone: ---
Keywords: Regression
Target Release: RHGS 3.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-06 04:18:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1503070, 1503137

Description Sweta Anandpara 2018-03-01 07:35:33 UTC
Description of problem:
=======================

Had a 6-node cluster with an interim build of RHGS 3.4 (glusterfs-3.12.2-4). Created a RHGS-Console management node and imported the cluster. The entire package installation (on the RHGS nodes) went through successfully, but at the end the nodes went to 'Non Operational' state with the following error:

2018-03-01 11:22:27,136 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-15) [4626e374] Correlation ID: 4626e374, Job ID: 4d8aaee8-9735-4a3b-a95e-d10e652acbfe, Call Stack: null, Custom Event ID: -1, Message: Host dhcp37-210.lab.eng.blr.redhat.com is installed with VDSM version (4.19) and cannot join cluster rhel7.5_rhgs34_6node which is compatible with VDSM versions [4.13, 4.14, 4.9, 4.16, 4.11, 4.15, 4.12, 4.10].
2018-03-01 11:22:27,161 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-15) [4626e374] Host d5717985-8a8a-4d7f-9366-dbda67f3ae47 : dhcp37-210.lab.eng.blr.redhat.com is already in NonOperational status for reason VERSION_INCOMPATIBLE_WITH_CLUSTER. SetNonOperationalVds command is skipped.
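The failure mode in the log above can be sketched as a simple membership check: the engine compares the VDSM version the host reports against the list of VDSM versions the cluster's compatibility level accepts, and 4.19 is not in that list. This is an illustrative sketch only, with hypothetical names, not the actual oVirt engine code:

```python
# Hypothetical sketch of the engine-side compatibility check seen in the
# log above. Function and variable names are illustrative assumptions.

def is_host_compatible(host_vdsm_version, cluster_supported_versions):
    """Return True if the host's reported VDSM version is accepted by the cluster."""
    return host_vdsm_version in cluster_supported_versions

# Versions taken verbatim from the engine log message in this bug:
supported = ["4.13", "4.14", "4.9", "4.16", "4.11", "4.15", "4.12", "4.10"]

print(is_host_compatible("4.19", supported))  # False -> VERSION_INCOMPATIBLE_WITH_CLUSTER
print(is_host_compatible("4.13", supported))  # True
```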

The vdsm and ovirt-engine logs and the sosreports can be accessed as follows:
Step 1: mount -t cifs -oguest //newtssuitcase.lab.eng.blr.redhat.com/newtssuitcase <mountpath>
Step 2: cd <mountpath>/BZ_sosreports/<bugID>

Version-Release number of selected component (if applicable):
============================================================
[root@dhcp37-95 ~]# rpm -qa | grep vdsm
vdsm-4.19.43-2.1.el7rhgs.x86_64
vdsm-xmlrpc-4.19.43-2.1.el7rhgs.noarch
vdsm-jsonrpc-4.19.43-2.1.el7rhgs.noarch
vdsm-cli-4.19.43-2.1.el7rhgs.noarch
vdsm-api-4.19.43-2.1.el7rhgs.noarch
vdsm-python-4.19.43-2.1.el7rhgs.noarch
vdsm-yajsonrpc-4.19.43-2.1.el7rhgs.noarch
vdsm-gluster-4.19.43-2.1.el7rhgs.noarch
[root@dhcp37-95 ~]# 



How reproducible:
==================
1:1 (on all 6 nodes of the cluster)


Additional info:
================

[root@dhcp37-95 ~]# gluster peer status
Number of Peers: 5

Hostname: dhcp37-56.lab.eng.blr.redhat.com
Uuid: 2511189b-23b0-4ef8-858c-0b0220b22276
State: Peer in Cluster (Connected)

Hostname: dhcp37-210.lab.eng.blr.redhat.com
Uuid: 2058367a-963e-4b9d-b451-00748cf1e22f
State: Peer in Cluster (Connected)

Hostname: dhcp37-216.lab.eng.blr.redhat.com
Uuid: 524c7a23-ead0-4970-b9e4-5a52adda483a
State: Peer in Cluster (Connected)

Hostname: dhcp37-57.lab.eng.blr.redhat.com
Uuid: d05a6a6a-1fe6-4918-8f78-f1cf01d63681
State: Peer in Cluster (Connected)

Hostname: dhcp37-44
Uuid: 074ebb84-3618-45a2-81bc-871b7d78a1db
State: Peer in Cluster (Connected)
[root@dhcp37-95 ~]# 
[root@dhcp37-95 ~]# rpm -qa | grep gluster
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-client-xlators-3.12.2-4.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-12.el7.x86_64
glusterfs-server-3.12.2-4.el7rhgs.x86_64
python2-gluster-3.12.2-4.el7rhgs.x86_64
glusterfs-libs-3.12.2-4.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-cli-3.12.2-4.el7rhgs.x86_64
glusterfs-3.12.2-4.el7rhgs.x86_64
glusterfs-api-3.12.2-4.el7rhgs.x86_64
glusterfs-fuse-3.12.2-4.el7rhgs.x86_64
glusterfs-rdma-3.12.2-4.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-4.el7rhgs.x86_64
vdsm-gluster-4.19.43-2.1.el7rhgs.noarch
[root@dhcp37-95 ~]# 
[root@dhcp37-95 ~]# 
[root@dhcp37-95 ~]# gluster v list
rep
temp
[root@dhcp37-95 ~]#

Comment 2 Sahina Bose 2018-03-07 08:19:46 UTC
This requires changes in both RHGS-C and vdsm:
- RHGS-C needs to accept the 4.19 version
- vdsm needs to report the 3.5 cluster level as a supported capability as well. Sweta, can you raise a corresponding bug in the vdsm component too?
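The second point above can be sketched as a change to the capabilities payload a host reports to the engine: the list of supported cluster levels would need to include 3.5. The structure and key names below are illustrative assumptions, not actual vdsm code:

```python
# Illustrative sketch (hypothetical names, not actual vdsm code) of the fix
# direction described in the comment above.

def get_capabilities():
    """Hypothetical capabilities payload a host reports to the engine."""
    return {
        "vdsmVersion": "4.19.43",
        # Adding "3.5" to the supported cluster levels lets the engine
        # match the host against a 3.5-compatibility cluster.
        "clusterLevels": ["3.4", "3.5"],
    }

caps = get_capabilities()
print("3.5" in caps["clusterLevels"])  # True
```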

Comment 8 Sahina Bose 2018-03-27 13:21:41 UTC
With the fix for bug 1553130, we realised that no change was needed in RHGS-C after all. This bug can therefore be verified with just the vdsm rpms that fix bug 1553130.