Bug 1057088 - ovirt-node cannot be added as a host from webUI
Summary: ovirt-node cannot be added as a host from webUI
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.4.0
Assignee: Yaniv Bronhaim
QA Contact: Pavel Stehlik
URL:
Whiteboard: infra
Depends On:
Blocks: 1058763
 
Reported: 2014-01-23 12:50 UTC by Petr Beňas
Modified: 2015-01-04 23:05 UTC
CC List: 18 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1058763
Environment:
Last Closed: 2014-01-28 12:08:42 UTC
oVirt Team: ---
Embargoed:


Attachments
ovirt.log (213.52 KB, text/plain), 2014-01-24 15:01 UTC, Petr Beňas
ovirt-node.log (16.30 KB, text/plain), 2014-01-24 15:02 UTC, Petr Beňas
vdsm.log (141.07 KB, text/x-log), 2014-01-24 15:03 UTC, Petr Beňas
supervdsm.log (30.26 KB, text/plain), 2014-01-24 15:04 UTC, Petr Beňas

Description Petr Beňas 2014-01-23 12:50:08 UTC
Description of problem:


Version-Release number of selected component (if applicable):
oVirt Node Hypervisor release 3.0.3 (0.999.201401221941draft.el6)

How reproducible:
100%

Steps to Reproduce:
1. Click Hosts tab
2. Click New
3. Fill in the host's IP address and root password
4. Click OK

Actual results:
The host status changes to Installing, and later to Install failed.
If you try Re-Install, only the Close button is available and an error message states: "There are no ISO versions that are compatible with the Host's current version."

Expected results:


Additional info:
/var/log/vdsm-reg/vdsm-reg.log:
libvir: XML-RPC error : authentication failed: authentication failed
libvir: XML-RPC error : authentication failed: authentication failed
libvir: XML-RPC error : authentication failed: authentication failed
Traceback (most recent call last):
  File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_main
  File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code
  File "/usr/share/vdsm/configNetwork.py", line 751, in <module>
    main()
  File "/usr/share/vdsm/configNetwork.py", line 726, in main
    delNetwork(bridge, **kwargs)
  File "/usr/share/vdsm/configNetwork.py", line 186, in wrapped
    return func(*args, **kwargs)
  File "/usr/share/vdsm/configNetwork.py", line 382, in delNetwork
    _netinfo = netinfo.NetInfo()
  File "/usr/lib64/python2.6/site-packages/vdsm/netinfo.py", line 648, in __init__
    _netinfo = get()
  File "/usr/lib64/python2.6/site-packages/vdsm/netinfo.py", line 551, in get
    for net, netAttr in networks().iteritems():
  File "/usr/lib64/python2.6/site-packages/vdsm/netinfo.py", line 124, in networks
    conn = libvirtconnection.get()
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 146, in get
    conn = utils.retry(libvirtOpenAuth, timeout=10, sleep=0.2)
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 948, in retry
    return func()
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 102, in openAuth
libvirt.libvirtError: authentication failed: authentication failed

ret=255
MainThread::DEBUG::2014-01-23 12:44:29,762::deployUtil::1124::root::makeBridge return.
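
The repeated "authentication failed" lines come from vdsm retrying the libvirt connection before giving up (the utils.retry(libvirtOpenAuth, timeout=10, sleep=0.2) call in the traceback). A minimal sketch of that retry pattern, assuming it simply loops until the timeout expires; this is illustrative only, not vdsm's actual implementation:

import time

def retry(func, timeout=10, sleep=0.2):
    # Keep calling func() until it succeeds or the timeout is exceeded;
    # the last failure is re-raised, which is what shows up in the traceback.
    deadline = time.time() + timeout
    while True:
        try:
            return func()
        except Exception:
            if time.time() >= deadline:
                raise
            time.sleep(sleep)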

Comment 1 Petr Beňas 2014-01-23 13:20:09 UTC
The node was installed with SELinux enabled, but the same problem happens when tested in permissive mode.

Comment 2 Fabian Deutsch 2014-01-23 13:34:44 UTC
Looks more like a vdsm problem with libvirt.

Comment 3 Dan Kenigsberg 2014-01-23 19:11:07 UTC
It seems like Vdsm failed to configure libvirtd or to restart it after configuration. What is the content of /etc/libvirt/libvirt.conf ? If it mentions vdsm, would the problem go away once you `pkill libvirtd` and wait for it to restart?
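
A rough sketch of that check, assuming the file path and the automatic libvirtd respawn behave as described above (a hypothetical helper for reproducers, not part of vdsm):

import subprocess

LIBVIRT_CONF = '/etc/libvirt/libvirt.conf'

with open(LIBVIRT_CONF) as conf:
    mentions_vdsm = 'vdsm' in conf.read()

if mentions_vdsm:
    print('libvirt.conf was modified by vdsm; killing libvirtd')
    # libvirtd is expected to be respawned automatically after the kill,
    # picking up the current configuration (per the comment above).
    subprocess.call(['pkill', 'libvirtd'])
else:
    print('libvirt.conf does not mention vdsm')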

Comment 4 Petr Beňas 2014-01-24 12:44:51 UTC
Unfortunately I don't have the host in the original state in which it was hitting this bug. Meanwhile I've reinstalled it with Fedora and successfully used it as a host with oVirt 3.4. I reinstalled oVirt Node on it today, but was unable to hit the bug with the same scenario.

However, the host ends up in 'Non Operational' state instead of 'Install failed' (permissive mode). The error message says "The Host emulated machine flags doesn't match one of the cluster emulated machines.", which is not the case (cluster and machine are both Ivy Bridge) and is probably a different bug. The traceback from vdsm-reg.log is gone, so I guess posting the contents of the libvirt configuration file makes no sense without reproducing the bug.

Comment 5 Fabian Deutsch 2014-01-24 14:37:32 UTC
(In reply to Petr Beňas from comment #4)
> However, the host ends up in 'Non Operational' state instead of 'Install
> failed' (permissive mode). The error message says "The Host emulated machine
> flags doesn't match one of the cluster emulated machines.", which is not the
> case (cluster and machine are both Ivy Bridge) and is probably a different
> bug. The traceback from vdsm-reg.log is gone, so I guess posting the contents
> of the libvirt configuration file makes no sense without reproducing the bug.

I think we should then investigate why it is in an unusable state.

Comment 6 Fabian Deutsch 2014-01-24 14:52:01 UTC
Sorry, I forgot to add the files which can help us:

Node:
/var/log/vdsm/*
/var/log/ovirt*

On the engine side:

Dan?

Comment 7 Petr Beňas 2014-01-24 15:01:44 UTC
Created attachment 855000
ovirt.log

Comment 8 Petr Beňas 2014-01-24 15:02:29 UTC
Created attachment 855002
ovirt-node.log

Comment 9 Petr Beňas 2014-01-24 15:03:01 UTC
Created attachment 855003
vdsm.log

Comment 10 Petr Beňas 2014-01-24 15:04:02 UTC
Created attachment 855004
supervdsm.log

Comment 12 Itamar Heim 2014-01-26 08:11:33 UTC
Setting target release to current version for consideration and review. Please do not push non-RFE bugs to an undefined target release, to make sure bugs are reviewed for relevancy, fix, closure, etc.

Comment 13 Dan Kenigsberg 2014-01-27 10:19:14 UTC
Non-Operational state is decided by Engine, based on Vdsm's reports. A quick look at the output of getVdsCaps did not reveal the answer to me. Does Engine report the reason? Does the engine log have hints?

{'HBAInventory': {'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:2e3eb7382147'}], 'FC': []}, 'packages2': {'kernel': {'release': '431.3.1.el6.x86_64', 'buildtime': 1388785167.0, 'version': '2.6.32'}, 'spice-server': {'release': '6.el6_5.1', 'buildtime': 1386756528L, 'version': '0.12.4'}, 'vdsm': {'release': '2.el6', 'buildtime': 1390309864L, 'version': '4.14.1'}, 'qemu-kvm': {'release': '2.415.el6_5.3', 'buildtime': 1386101870L, 'version': '0.12.1.2'}, 'libvirt': {'release': '29.el6_5.2', 'buildtime': 1387360004L, 'version': '0.10.2'}, 'qemu-img': {'release': '2.415.el6_5.3', 'buildtime': 1386101870L, 'version': '0.12.1.2'}, 'mom': {'release': '20140120.gitfd877c5.el6', 'buildtime': 1390225415L, 'version': '0.3.2'}}, 'cpuModel': 'Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz', 'hooks': {}, 'cpuSockets': '1', 'vmTypes': ['kvm'], 'supportedProtocols': ['2.2', '2.3'], 'networks': {}, 'bridges': {';vdsmdummy;': {'addr': '', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'stp': 'off', 'ipv6gateway': '::', 'gateway': '', 'ports': []}}, 'uuid': '4C4C4544-0034-5310-8052-B4C04F4A354A', 'lastClientIface': 'em1', 'nics': {'em1': {'netmask': '255.255.252.0', 'addr': '10.34.62.208', 'hwaddr': 'd4:ae:52:c7:0d:8b', 'cfg': {'DEVICE': 'em1', 'HWADDR': 'd4:ae:52:c7:0d:8b', 'BOOTPROTO': 'dhcp', 'ONBOOT': 'yes', 'PEERNTP': 'yes'}, 'ipv6addrs': ['fe80::d6ae:52ff:fec7:d8b/64', '2620:52:0:223c:d6ae:52ff:fec7:d8b/64'], 'speed': 1000, 'mtu': '1500'}, 'em2': {'netmask': '', 'addr': '', 'hwaddr': 'd4:ae:52:c7:0d:8c', 'cfg': {}, 'ipv6addrs': [], 'speed': 0, 'mtu': '1500'}}, 'software_revision': '2', 'clusterLevels': ['3.0', '3.1', '3.2', '3.3', '3.4'], 'cpuFlags': u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,aperfmperf,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,f16c,rdrand,lahf_lm,ida,arat,epb,xsaveopt,pln,pts,dts,tpr_shadow,vnmi,flexpriority,ept,vpid,fsgsbase,smep,erms,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270,model_SandyBridge', 'ISCSIInitiatorName': 'iqn.1994-05.com.redhat:2e3eb7382147', 'netConfigDirty': 'False', 'supportedENGINEs': ['3.0', '3.1', '3.2', '3.3', '3.4'], 'reservedMem': '321', 'bondings': {'bond4': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}, 'bond0': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}, 'bond1': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}, 'bond2': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}, 'bond3': {'netmask': '', 'addr': '', 'slaves': [], 'hwaddr': '00:00:00:00:00:00', 'cfg': {}, 'ipv6addrs': [], 'mtu': '1500'}}, 'software_version': '4.14', 'memSize': '7843', 'cpuSpeed': '3292.565', 'version_name': 'Snow Man', 'vlans': {}, 'cpuCores': '4', 'kvmEnabled': 'true', 'guestOverhead': '65', 'management_ip': '0.0.0.0', 'cpuThreads': '8', 'emulatedMachines': [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', u'rhel5.4.0'], 'rngSources': ['random'], 'operatingSystem': {'release': '0.999.201401221941draft.el6', 'version': 
'3.0.3', 'name': 'oVirt Node'}, 'lastClient': '10.35.18.169'}

Comment 14 Petr Beňas 2014-01-27 11:13:43 UTC
As mentioned in comment 4, it reports 'The Host emulated machine flags doesn't match one of the cluster emulated machines.'. engine.log does not give many hints either; attaching the related snippet.

Comment 16 Dan Kenigsberg 2014-01-27 18:09:51 UTC
Sorry, Petr, I missed this info from comment 4. Could it be that your cluster is Fedora-based (and as such lacks

  'emulatedMachines': [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', u'rhel5.4.0']

)? What are the reported emulatedMachines on other hosts on that cluster?

Comment 17 Petr Beňas 2014-01-28 08:39:23 UTC
I don't have any other host in the cluster, but the same node was previously in the cluster with Fedora installed on it. The "Emulated machine" field in the cluster's General tab says "pc-1.0", but it doesn't appear to be changeable in the Edit cluster dialog.

After creating a new cluster (the emulated machine field was empty after creation) I was able to successfully add the ovirt-node host. Does it mean it is not possible to have ovirt-node and Fedora hosts in the same cluster? In RHEV, it's possible to have RHEL and RHEV-H nodes in the same cluster...

Comment 18 Dan Kenigsberg 2014-01-28 12:08:42 UTC
You can have a cluster of Fedora-19 and Fedora-19-based ovirt-node. But you cannot mix el6 and f19 in the same cluster, as their qemus do not support the same "emulated machine types".

Omer may be able to help on how to edit the sticky emulated machine from the cluster definition in Engine.

Please reopen the bug if you can reproduce the original case of comment 0 and provide the information requested in comment 3.

Comment 19 Omer Frenkel 2014-01-28 13:25:57 UTC
The logic is that the first host in the cluster sets the emulated machine parameter for the cluster, and currently this is not editable by the user (there is a bug open for that).

As a workaround, you can set this to null in the db, and the next host added will set it again:
update vds_groups set emulated_machine=null where name='the_cluster_name';
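
For illustration, a minimal sketch of the compatibility check behind the "emulated machine flags" error, assuming the engine simply requires the cluster's emulated machine to appear in the list the host reports via getVdsCaps (variable names are hypothetical, not engine code):

# Value set by the first (Fedora) host that joined the cluster.
cluster_emulated_machine = 'pc-1.0'

# List reported by the el6-based oVirt Node in comment 13.
host_emulated_machines = ['rhel6.5.0', 'pc', 'rhel6.4.0', 'rhel6.3.0',
                          'rhel6.2.0', 'rhel6.1.0', 'rhel6.0.0',
                          'rhel5.5.0', 'rhel5.4.4', 'rhel5.4.0']

if cluster_emulated_machine not in host_emulated_machines:
    # Engine would mark the host Non Operational with
    # "The Host emulated machine flags doesn't match one of the
    # cluster emulated machines."
    print('host rejected: emulated machine %s not supported'
          % cluster_emulated_machine)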

