Bug 894317

Summary: Unable to add a host successfully to a gluster-cluster
Product: Red Hat Enterprise Virtualization Manager
Reporter: Shruti Sampat <ssampat>
Component: ovirt-engine
Assignee: Yaniv Bronhaim <ybronhei>
Status: CLOSED CURRENTRELEASE
QA Contact: Shruti Sampat <ssampat>
Severity: high
Docs Contact:
Priority: urgent
Version: 3.2.0
CC: alonbl, dyasny, ecohen, iheim, lpeer, mmahoney, pprakash, Rhev-m-bugs, sdharane, sgrinber, shireesh, vbellur, yeylon, ykaul, yzaslavs
Target Milestone: ---
Keywords: Reopened
Target Release: 3.2.0
Hardware: All
OS: All
Whiteboard: gluster
Fixed In Version: sf4
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-01-14 11:03:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Gluster
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (description, flags):
logs under /var/log/ovirt-engine/host-deploy (flags: none)
engine logs (flags: none)

Description Shruti Sampat 2013-01-11 12:16:38 UTC
Created attachment 676834 [details]
logs under /var/log/ovirt-engine/host-deploy

Description of problem:
---------------------------------------
When trying to add a host to a gluster-cluster from RHEVM, the host goes to the Non-Operational state. It is an RHS 2.0plus (RHEL 6.2) based node with the following version of glusterfs installed -

glusterfs-3.4.0qa5-1.el6rhs.x86_64

The following is the version of vdsm used - 
vdsm-4.9.6-32.0.qa3.el6rhs.x86_64

The following is the version of libvirt used  -
libvirt-0.9.10-21.el6_3.6.x86_64

Version-Release number of selected component (if applicable):
oVirt Engine Version: 3.2.0-4.el6ev 

How reproducible:


Steps to Reproduce:
1. Add a host to a gluster-cluster from RHEVM.
  
Actual results:
Host goes to Non-Operational state.

Expected results:
Host should be successfully added.

Additional info:

Comment 1 Shruti Sampat 2013-01-11 12:17:27 UTC
Created attachment 676835 [details]
engine logs

Comment 2 Alon Bar-Lev 2013-01-12 08:54:03 UTC
I don't think it has anything to do with deploy.

2013-01-11 17:40:00,082 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (QuartzScheduler_Worker-58) [6c426c65] START, GetHardwareInfoVDSCommand(HostName = 10.70.35.112, HostId = 3db8e38e-31ab-4896-8649-087689ea1e14, vds=org.ovirt.engine.core.common.businessentities.VDS@f474674d), log id: 5cac5a09
2013-01-11 17:40:00,105 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand] (QuartzScheduler_Worker-58) [6c426c65] XML RPC error in command GetHardwareInfoVDS ( HostName = 10.70.35.112 ), the error was: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException, <type 'exceptions.Exception'>:method "getVdsHardwareInfo" is not supported 
2013-01-11 17:40:00,105 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (QuartzScheduler_Worker-58) [6c426c65] FINISH, GetHardwareInfoVDSCommand, log id: 5cac5a09
2013-01-11 17:40:00,140 INFO  [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (QuartzScheduler_Worker-58) [1fd4e637] Running command: HandleVdsVersionCommand internal: true. Entities affected :  ID: 3db8e38e-31ab-4896-8649-087689ea1e14 Type: VDS
2013-01-11 17:40:00,179 INFO  [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (QuartzScheduler_Worker-58) [61692379] Running command: SetNonOperationalVdsCommand internal: true. Entities affected :  ID: 3db8e38e-31ab-4896-8649-087689ea1e14 Type: VDS
2013-01-11 17:40:00,181 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (QuartzScheduler_Worker-58) [61692379] START, SetVdsStatusVDSCommand(HostName = 10.70.35.112, HostId = 3db8e38e-31ab-4896-8649-087689ea1e14, status=NonOperational, nonOperationalReason=VERSION_INCOMPATIBLE_WITH_CLUSTER), log id: 72b1f85f
2013-01-11 17:40:00,183 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (QuartzScheduler_Worker-58) [61692379] FINISH, SetVdsStatusVDSCommand, log id: 72b1f85f
2013-01-11 17:40:00,213 INFO  [org.ovirt.engine.core.vdsbroker.ActivateVdsVDSCommand] (QuartzScheduler_Worker-58) [61692379] FINISH, ActivateVdsVDSCommand, log id: 5f973077

Comment 3 Alon Bar-Lev 2013-01-12 15:40:08 UTC
Sorry,
I assigned this to Barak, and it was reset.

Barak,
This has something to do with the hardware info query.

Comment 4 Itamar Heim 2013-01-12 21:45:39 UTC
The VDSM version is 3.1. GetHardwareInfoVDSCommand should not be used on a 3.1 cluster level.

Comment 5 Alon Bar-Lev 2013-01-12 21:47:09 UTC
(In reply to comment #4)
> the VDSM version is 3.1. GetHardwareInfoVDSCommand should not be used on a
> 3.1 cluster level.

Or it should be used, but not fail.
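Comment 5's alternative (issue the call, but tolerate the failure) could look like the following sketch. Everything here is a hypothetical stand-in, not the actual ovirt-engine code: an "is not supported" XML-RPC fault from an older VDSM is treated as "no hardware info available" instead of a fatal error.

```java
import java.util.Collections;
import java.util.Map;

public class HardwareInfoFallback {
    // Stand-in for the XML-RPC proxy; old VDSM throws on the unknown method.
    interface VdsRpc {
        Map<String, String> getVdsHardwareInfo() throws Exception;
    }

    static Map<String, String> fetchHardwareInfo(VdsRpc rpc) {
        try {
            return rpc.getVdsHardwareInfo();
        } catch (Exception e) {
            // Old VDSM answers: method "getVdsHardwareInfo" is not supported.
            if (e.getMessage() != null && e.getMessage().contains("is not supported")) {
                return Collections.emptyMap(); // degrade gracefully, no hardware info
            }
            throw new RuntimeException(e); // real transport failures still surface
        }
    }

    public static void main(String[] args) {
        Map<String, String> info = fetchHardwareInfo(() -> {
            throw new Exception("method \"getVdsHardwareInfo\" is not supported");
        });
        System.out.println(info.isEmpty()); // prints true
    }
}
```

With this shape the host would stay operational with empty hardware details rather than being dropped; the fix that actually landed (comment 7) instead avoids the call entirely on older hosts.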

Comment 6 Yaniv Bronhaim 2013-01-14 11:03:06 UTC
GetHardwareInfoVDSCommand is not called by the engine when using the 3.1 cluster level.
I verified it again with the gluster service enabled.

If you add a host with a vdsm version older than 3.2 to a 3.2 cluster, the hardware information request fails.
This scenario involves an old vdsm version on a host that was added to a 3.2 cluster: when asked to retrieve the hardware information, the request failed and the engine set the host to non-operational, as it is supposed to.

Comment 7 Yaniv Bronhaim 2013-01-15 10:37:05 UTC
The request for getHardwareInfo shouldn't appear in this case. Validation of the VDSM cluster support level was missing; only the cluster capability was checked.

This http://gerrit.ovirt.org/#/c/11027/ adds it.
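The fix gates the hardware-info call on the VDSM cluster support level rather than on cluster capability alone. A minimal sketch of that kind of gate, assuming the host reports its supported cluster levels as dotted version strings (all names below are hypothetical illustrations, not the code from the gerrit change):

```java
import java.util.Arrays;
import java.util.List;

public class HardwareInfoGate {
    // Hypothetical helper: only issue getVdsHardwareInfo when the host's
    // reported supported cluster levels include 3.2 or higher.
    static boolean supportsHardwareInfo(List<String> supportedClusterLevels) {
        return supportedClusterLevels.stream()
                .anyMatch(level -> compareVersions(level, "3.2") >= 0);
    }

    // Compare dotted version strings numerically (so "3.10" > "3.2").
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\."), pb = b.split("\\.");
        for (int i = 0; i < Math.max(pa.length, pb.length); i++) {
            int va = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int vb = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (va != vb) return Integer.compare(va, vb);
        }
        return 0;
    }

    public static void main(String[] args) {
        // A host like the one in this report, supporting only 3.0/3.1:
        System.out.println(supportsHardwareInfo(Arrays.asList("3.0", "3.1"))); // prints false
        // A 3.2-capable host:
        System.out.println(supportsHardwareInfo(Arrays.asList("3.1", "3.2"))); // prints true
    }
}
```

With such a check in place, a 3.1-only host in a 3.2 cluster is simply never asked for hardware info, which matches the verified behavior in comment 9.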

Comment 9 Shruti Sampat 2013-03-20 07:17:38 UTC
Verified in Red Hat Enterprise Virtualization Manager Version: 3.2.0-10.14.beta1.el6ev. GetHardwareInfoVDSCommand is not run, and the host goes to Non-Operational state.

Comment 10 Itamar Heim 2013-06-11 09:52:51 UTC
3.2 has been released

Comment 11 Itamar Heim 2013-06-11 09:53:04 UTC
3.2 has been released

Comment 12 Itamar Heim 2013-06-11 09:59:29 UTC
3.2 has been released