Bug 1058763 - Explain why el6 and Fedora hosts cannot be used in the same cluster
Summary: Explain why el6 and Fedora hosts cannot be used in the same cluster
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Documentation
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ovirt-4.0.1
Target Release: ---
Assignee: Stephen Gordon
QA Contact: Gonza
URL:
Whiteboard:
Depends On: 1057088
Blocks:
 
Reported: 2014-01-28 13:50 UTC by Petr Beňas
Modified: 2016-07-22 16:39 UTC
CC: 19 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1057088
Environment:
Last Closed: 2016-07-22 16:39:19 UTC
oVirt Team: Docs
Embargoed:
ylavi: ovirt-4.0.z?
ylavi: Triaged+
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?



Description Petr Beňas 2014-01-28 13:50:59 UTC
The oVirt Quickstart guide says "In oVirt, you can use either oVirt Node or Fedora as hosts." I don't see any warning that it is not possible to use Fedora and oVirt Node hosts in the same cluster. Please update the docs or make it possible to use both supported host operating systems in one cluster.


+++ This bug was initially created as a clone of Bug #1057088 +++

Description of problem:


Version-Release number of selected component (if applicable):
oVirt Node Hypervisor release 3.0.3 (0.999.201401221941draft.el6)

How reproducible:
100%

Steps to Reproduce:
1. Click Hosts tab
2. Click New
3. fill in host's IP address and root password
4. Click OK
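
(For reference, the same add-host operation can be scripted against the REST API; a sketch only, assuming oVirt 3.x-era endpoints and placeholder credentials:

  # POST a new host definition to the engine; -k skips certificate checks on a test setup
  curl -k -u admin@internal:password -H "Content-Type: application/xml" \
       -d '<host><name>node1</name><address>10.34.62.208</address><root_password>password</root_password></host>' \
       https://engine.example.com/api/hosts
)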

Actual results:
Host status changes to Installing, later to Install failed.
If you try Re-Install, only the Close button is available, with an error message stating: "There are no ISO versions that are compatible with the Host's current version."

Expected results:


Additional info:
/var/log/vdsm-reg/vdsm-reg.log :
libvir: XML-RPC error : authentication failed: authentication failed
libvir: XML-RPC error : authentication failed: authentication failed
libvir: XML-RPC error : authentication failed: authentication failed
Traceback (most recent call last):
  File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_main
  File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code
  File "/usr/share/vdsm/configNetwork.py", line 751, in <module>
    main()
  File "/usr/share/vdsm/configNetwork.py", line 726, in main
    delNetwork(bridge, **kwargs)
  File "/usr/share/vdsm/configNetwork.py", line 186, in wrapped
    return func(*args, **kwargs)
  File "/usr/share/vdsm/configNetwork.py", line 382, in delNetwork
    _netinfo = netinfo.NetInfo()
  File "/usr/lib64/python2.6/site-packages/vdsm/netinfo.py", line 648, in __init__
    _netinfo = get()
  File "/usr/lib64/python2.6/site-packages/vdsm/netinfo.py", line 551, in get
    for net, netAttr in networks().iteritems():
  File "/usr/lib64/python2.6/site-packages/vdsm/netinfo.py", line 124, in networks
    conn = libvirtconnection.get()
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 146, in get
    conn = utils.retry(libvirtOpenAuth, timeout=10, sleep=0.2)
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 948, in retry
    return func()
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 102, in openAuth
libvirt.libvirtError: authentication failed: authentication failed

ret=255
MainThread::DEBUG::2014-01-23 12:44:29,762::deployUtil::1124::root::makeBridge return.

--- Additional comment from Petr Beňas on 2014-01-23 08:20:09 EST ---

Node was installed with SELinux on, but the same problem happens if tested with permissive mode.

--- Additional comment from Fabian Deutsch on 2014-01-23 08:34:44 EST ---

Looks more like a vdsm problem with libvirt.

--- Additional comment from Dan Kenigsberg on 2014-01-23 14:11:07 EST ---

It seems like Vdsm failed to configure libvirtd or to restart it after configuration. What is the content of /etc/libvirt/libvirt.conf? If it mentions vdsm, would the problem go away once you `pkill libvirtd` and wait for it to restart?
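
(A minimal way to check this on the node, assuming the paths mentioned above:

  # Does vdsm show up in the libvirt configuration?
  grep -i vdsm /etc/libvirt/libvirt.conf

  # Kill libvirtd; on oVirt Node it should be respawned automatically
  pkill libvirtd
)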

--- Additional comment from Petr Beňas on 2014-01-24 07:44:51 EST ---

Unfortunately I don't have the host in the original state, when it was hitting this bug. Meanwhile I've reinstalled it to Fedora and successfully used it as a host with oVirt 3.4. I reinstalled it back to oVirt Node today, but was unable to hit the bug with the same scenario.

However, the host ends up in 'Non Operational' state instead of 'Install failed' (permissive mode). The error message says "The Host emulated machine flags doesn't match one of the cluster emulated machines." which is not the case (cluster and machine are both Ivy Bridge) and is also probably a different bug. The traceback from vdsm-reg.log is gone, so I guess posting the contents of the libvirt configuration file makes no sense without reproducing the bug.

--- Additional comment from Fabian Deutsch on 2014-01-24 09:37:32 EST ---

(In reply to Petr Beňas from comment #4)
> However, the host ends up in 'Non Operational' state instead of 'Install
> failed' (permissive mode). The error message says "The Host emulated machine
> flags doesn't match one of the cluster emulated machines." which is not the
> case (cluster and machine are both Ivy Bridge) and is also probably a
> different bug. The traceback from vdsm-reg.log is gone, so I guess posting
> the contents of the libvirt configuration file makes no sense without
> reproducing the bug.

I think we should then investigate why it is in an unusable state.

--- Additional comment from Fabian Deutsch on 2014-01-24 09:52:01 EST ---

Sorry, I forgot to add the files which can help us:

Node:
/var/log/vdsm/*
/var/log/ovirt*

On the engine side:

Dan?

--- Additional comments from Petr Beňas on 2014-01-24, 10:01–10:04 EST (five empty comments, presumably the requested log attachments) ---

--- Additional comment from Itamar Heim on 2014-01-26 03:11:33 EST ---

Setting target release to current version for consideration and review. Please
do not push non-RFE bugs to an undefined target release, to make sure bugs are
reviewed for relevancy, fix, closure, etc.

--- Additional comment from Dan Kenigsberg on 2014-01-27 05:19:14 EST ---

Non-Operational state is decided by Engine, based on Vdsm's reports. A quick look at the output of getVdsCaps did not reveal the answer to me. Does Engine report the reason? Does the engine log have hints?
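
(For reference, a capabilities report like the one below can be pulled straight from the host; a sketch, assuming the vdsClient CLI that shipped with VDSM in this era:

  # Query host capabilities over the SSL-secured local VDSM connection
  vdsClient -s 0 getVdsCaps
)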

{'HBAInventory': {'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:2e3eb7382147'}], 'FC': []}, 'packages2': {'kernel': {'release': '431.3.1.el6.x86_64', 'buildtime': 1388785167.0, 'version': '2.6.32'}, 'spice-server': {'release': '6.el6_5.1', 'buildtime': 1386756528L, 'version': '0.12.4'}, 'vdsm': {'release': '2.el6', 'buildtime': 1390309864L, 'version': '4.14.1'}, 'qemu-kvm': {'release': '2.415.el6_5.3', 'buildtime': 1386101870L, 'version': '0.12.1.2'}, 'libvirt': {'release': '29.el6_5.2', 'buildtime': 1387360004L, 'version': '0.10.2'}, 'qemu-img': {'release': '2.415.el6_5.3', 'buildtime': 1386101870L, 'version': '0.12.1.2'}, 'mom': {'release': '20140120.gitfd877c5.el6', 'buildtime': 1390225415L, 'version': '0.3.2'}}, 'cpuModel': 'Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz', 'hooks': {}, 'cpuSockets': '1', 'vmTypes': ['kvm'], 'supportedProtocols': ['2.2', '2.3'], 'networks': {}, 'bridges': {';vdsmdummy;': {'addr': '', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'stp': 'off', 'ipv6gateway': '::', 'gateway': '', 'ports': []}}, 'uuid': '4C4C4544-0034-5310-8052-B4C04F4A354A', 'lastClientIface': 'em1', 'nics': {'em1': {'netmask': '255.255.252.0', 'addr': '10.34.62.208', 'hwaddr': 'd4:ae:52:c7:0d:8b', 'cfg': {'DEVICE': 'em1', 'HWADDR': 'd4:ae:52:c7:0d:8b', 'BOOTPROTO': 'dhcp', 'ONBOOT': 'yes', 'PEERNTP': 'yes'}, 'ipv6addrs': ['fe80::d6ae:52ff:fec7:d8b/64', '2620:52:0:223c:d6ae:52ff:fec7:d8b/64'], 'speed': 1000, 'mtu': '1500'}, 'em2': {'netmask': '', 'addr': '', 'hwaddr': 'd4:ae:52:c7:0d:8c', 'cfg': {}, 'ipv6addrs': [], 'speed': 0, 'mtu': '1500'}}, 'software_revision': '2', 'clusterLevels': ['3.0', '3.1', '3.2', '3.3', '3.4'], 'cpuFlags': u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,aperfmperf,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,f16c,rdrand,lahf_lm,ida,arat,epb,xsaveopt,pln,pts,dts,tpr_shadow,vnmi,flexpriority,ept,vpid,fsgsbase,smep,erms,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270,model_SandyBridge', 'ISCSIInitiatorName': 'iqn.1994-05.com.redhat:2e3eb7382147', 'netConfigDirty': 'False', 'supportedENGINEs': ['3.0', '3.1', '3.2', '3.3', '3.4'], 'reservedMem': '321', 'bondings': {'bond4': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}, 'bond0': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}, 'bond1': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}, 'bond2': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}, 'bond3': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}}, 'software_version': '4.14', 'memSize': '7843', 'cpuSpeed': '3292.565', 'version_name': 'Snow Man', 'vlans': {}, 'cpuCores': '4', 'kvmEnabled': 'true', 'guestOverhead': '65', 'management_ip': '0.0.0.0', 'cpuThreads': '8', 'emulatedMachines': [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', u'rhel5.4.0'], 'rngSources': ['random'], 'operatingSystem': {'release': '0.999.201401221941draft.el6', 'version': 
'3.0.3', 'name': 'oVirt Node'}, 'lastClient': '10.35.18.169'}

--- Additional comment from Petr Beňas on 2014-01-27 06:13:43 EST ---

As mentioned in comment 4, it reports 'The Host emulated machine flags doesn't match one of the cluster emulated machines.'. engine.log does not give many hints either; attaching a related snippet.

--- Additional comment from Petr Beňas on 2014-01-27 06:15:53 EST (empty comment, presumably the engine.log snippet attachment) ---


--- Additional comment from Dan Kenigsberg on 2014-01-27 13:09:51 EST ---

Sorry, Petr, I've missed this info from comment 4. Could it be that your cluster is Fedora-based (and as such lacks

  'emulatedMachines': [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', u'rhel5.4.0']

)? What are the reported emulatedMachines on other hosts on that cluster?

--- Additional comment from Petr Beňas on 2014-01-28 03:39:23 EST ---

I don't have any other host in the cluster, but the same node was previously in the cluster with Fedora installed on it. The "Emulated machine" field in the cluster's General tab says "pc-1.0", but it doesn't look changeable in the Edit cluster dialog.

After creating a new cluster (the emulated machine field was empty after creation) I was able to successfully add the ovirt-node host. Does it mean it is not possible to have ovirt-node and Fedora hosts in the same cluster? In RHEV, it's possible to have RHEL and RHEV-H nodes in the same cluster...

--- Additional comment from Dan Kenigsberg on 2014-01-28 07:08:42 EST ---

You can have a cluster of Fedora-19 and Fedora-19-based ovirt-node hosts. But you cannot mix el6 and f19 in the same cluster, as their qemus do not support the same "emulated machine types".
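
(A quick way to compare what each side supports; a sketch, with binary paths that differ by distro:

  # On an el6 host -- lists rhel6.x.0, pc, ...
  /usr/libexec/qemu-kvm -M ?

  # On a Fedora host -- lists pc-1.0, pc-0.15, ...
  qemu-kvm -M ?
)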

Omer may be able to help on how to edit the sticky emulated machine from the cluster definition in Engine.

Please reopen the bug if you can reproduce the original case of comment 0 and provide the information requested in comment 3.

--- Additional comment from Omer Frenkel on 2014-01-28 08:25:57 EST ---

The logic is that the first host in the cluster sets the emulated machine parameter for the cluster, and currently this is not editable by the user (there is a bug on that).

As a workaround, you can set this to null in the DB, and the next host added will set it again:
update vds_groups set emulated_machine=null where name='the_cluster_name';
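
(For example, run on the engine machine; a sketch only -- the database name and user are typically "engine" but depend on how engine-setup was answered:

  psql -U engine -d engine -c "update vds_groups set emulated_machine=null where name='the_cluster_name';"
)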

Comment 1 Itamar Heim 2014-01-28 13:58:42 UTC
There is no such limitation: you can use a Fedora-based oVirt Node with Fedora, and an .el6-based oVirt Node with another .el6 oVirt Node.

Comment 2 Petr Beňas 2014-01-28 14:05:23 UTC
I just got bug 1057088 closed with a comment that it is not possible to use ovirt-node and Fedora hosts in the same cluster. https://bugzilla.redhat.com/show_bug.cgi?id=1057088#c18

Comment 3 Fabian Deutsch 2014-01-28 15:25:45 UTC
(In reply to Petr Beňas from comment #2)
> I just got bug 1057088 closed with a comment that it is not possible to use
> ovirt-node and Fedora hosts in the same cluster.
> https://bugzilla.redhat.com/show_bug.cgi?id=1057088#c18

Hey Petr,

I think that is a misunderstanding.
Bug 1057088 comment 18 states that the qemu version is relevant for compatibility.

That means that you can mix any Fedora-based hosts (Nodes and full-blown hosts) in the same cluster -- because their qemu versions are compatible.

But you _cannot_ mix Fedora (full-blown host or Node) with CentOS (full-blown host or Node) in the same cluster -- because their qemu versions are not compatible.

Do we agree?

Comment 4 Petr Beňas 2014-01-28 15:59:31 UTC
You're right, I was confused since I was not aware there are two versions of oVirt Node - one Fedora-based and the other RHEL-based.

But I still consider this bug valid. I am not asking to make Fedora and RHEL hosts run in the same cluster, but for a documentation change. Even now, when I know there are two different and incompatible versions of oVirt Node, I still find the sentence "In oVirt, you can use either oVirt Node or Fedora as hosts." misleading. There should be a note about the two versions of oVirt Node and probably also about the possibility of using RHEL6/CentOS6 as a host.

Comment 5 Fabian Deutsch 2014-01-28 16:57:20 UTC
(In reply to Petr Beňas from comment #4)
> You're right, I was confused since I was not aware there are two versions of
> oVirt Node - one Fedora-based and the other RHEL-based.

In oVirt there are actually just Fedora or CentOS based nodes - not RHEL based ones.

> But I still consider this bug valid. I am not asking to make Fedora and
> RHEL hosts run in the same cluster, but for a documentation change. Even
> now, when I know there are two different and incompatible versions of
> oVirt Node, I still find the sentence "In oVirt, you can use either oVirt
> Node or Fedora as hosts." misleading. There should be a note about the two
> versions of oVirt Node and probably also about the possibility of using
> RHEL6/CentOS6 as a host.

Yes, we should definitely add a sentence which:

a. points out the technical requirements for hosts to be in the same cluster (i.e. a qemu with the same features), and

b. explains that this is the reason why Fedora and CentOS/EL6 Nodes/hosts cannot be in the same cluster.

Comment 6 Sandro Bonazzola 2014-03-04 09:18:49 UTC
This is an automated message.
Re-targeting all non-blocker bugs still open on 3.4.0 to 3.4.1.

Comment 7 Sandro Bonazzola 2014-06-11 06:50:49 UTC
This is an automated message:
This bug has been re-targeted from 3.4.2 to 3.5.0 since neither priority nor severity was high or urgent. Please re-target to 3.4.3 if relevant.

Comment 9 Sandro Bonazzola 2015-09-04 08:59:41 UTC
This is an automated message.
This Bugzilla report has been opened on a version which is no longer maintained.
Please check if this bug is still relevant in oVirt 3.5.4.
If it's not relevant anymore, please close it (you may use the EOL or CURRENT RELEASE resolution).
If it's an RFE, please update the version to 4.0 if still relevant.

Comment 10 Gonza 2015-10-05 13:05:27 UTC
pbenas' original concern is still valid.
It is still not clear from [1] what the technical requirements are for more than one host to be in the same cluster.

[1] http://www.ovirt.org/Quick_Start_Guide

Comment 11 Red Hat Bugzilla Rules Engine 2015-10-19 10:57:53 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.

Comment 12 Sandro Bonazzola 2015-10-26 12:32:06 UTC
This is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted for next week, Nov 4th 2015.
Please review this bug and, if it is not a blocker, please postpone it to a later release.
All bugs not postponed by the GA release will be automatically re-targeted to:

- 3.6.1 if severity >= high
- 4.0 if severity < high

Comment 13 Sandro Bonazzola 2016-05-02 09:49:50 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

Comment 15 Fabian Deutsch 2016-07-22 16:39:19 UTC
This has been around for more than 2 years. Time to close it.

