Bug 818105 - [ovirt] [vdsm] [gluster]: vdsm fails to respond to getVdsCaps on non-virt host
Status: CLOSED CURRENTRELEASE
Product: oVirt
Classification: Community
Component: vdsm
Version: unspecified
Hardware: Unspecified Linux
Priority: unspecified Severity: high
Assigned To: Dan Kenigsberg
QA Contact: Haim
Whiteboard: infra
Depends On:
Blocks:
Reported: 2012-05-02 04:46 EDT by Haim
Modified: 2014-01-12 19:51 EST
Fixed In Version: v4.10.0
Doc Type: Bug Fix
Last Closed: 2012-06-17 14:23:50 EDT
Type: Bug
Description Haim 2012-05-02 04:46:40 EDT
Description of problem:

scenario: gluster support 

flow: add host 

result: when host goes up, getVdsCaps ends with exception 

reason: the host has no virt capabilities, which is more likely in the gluster case. We need to add logic that tests whether the host is going into a gluster cluster and sets the proper flags in the config file.

workaround: add "fake_kvm_support=True" to vdsm.conf
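
For reference, the workaround could be checked with a minimal sketch like the one below. It assumes the option lives in a [vars] section of vdsm.conf (the report names only the option, not the section), and it parses an in-memory sample with Python 3's configparser rather than reading the real file.

```python
import configparser

# Sample contents for vdsm.conf. The [vars] section name is an assumption;
# the report only names the fake_kvm_support option itself.
SAMPLE_VDSM_CONF = """
[vars]
fake_kvm_support = True
"""


def fake_kvm_enabled(conf_text):
    """Return True if fake_kvm_support is switched on in the given text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # fallback=False covers a missing section or a missing option
    return parser.getboolean("vars", "fake_kvm_support", fallback=False)


print(fake_kvm_enabled(SAMPLE_VDSM_CONF))
```

An empty or unrelated config yields False here, so the check reads as "off unless explicitly enabled".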
Comment 1 Itamar Heim 2012-05-03 14:32:45 EDT
I assume there is a better way for this than fake_kvm_support...
Comment 2 Dan Kenigsberg 2012-05-06 03:42:49 EDT
No vdsm version. No vdsm.log. If I were Kaul I would have closed this bug on sight.
Comment 3 Haim 2012-05-06 04:17:24 EDT
(In reply to comment #2)
> No vdsm version. No vdsm.log. If I were Kaul I would have closed this bug on
> sight.

It's not working by design; you don't really need the logs, but if you do:

Thread-21::DEBUG::2012-05-06 11:16:35,920::BindingXMLRPC::877::vds::(wrapper) client [10.35.97.135]::call getCapabilities with () {} flowID [68cc9fce]
Thread-21::ERROR::2012-05-06 11:16:35,949::BindingXMLRPC::886::vds::(wrapper) libvirt error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 882, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 291, in getCapabilities
    ret = api.getCapabilities()
  File "/usr/share/vdsm/API.py", line 1075, in getCapabilities
    c = caps.get()
  File "/usr/share/vdsm/caps.py", line 215, in get
    _getCompatibleCpuModels())
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 795, in __call__
    value = self.func(*args)
  File "/usr/share/vdsm/caps.py", line 129, in _getCompatibleCpuModels
    in allModels if compatible(model, vendor) ]
  File "/usr/share/vdsm/caps.py", line 125, in compatible
    return c.compareCPU(xml, 0) in (
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2389, in compareCPU
    if ret == -1: raise libvirtError ('virConnectCompareCPU() failed', conn=self)
libvirtError: Requested operation is not valid: cannot get host CPU capabilities

as for version: 

vdsm-4.9.6-0.153.gitf628baf.fc16.x86_64, a homemade build by Shireesh.
Comment 4 Dan Kenigsberg 2012-05-06 05:35:02 EDT
(In reply to comment #3)
> 
> Its not working by design, you don't really need the logs, but if you do: 

Not true. This SHOULD work, and this

>     return c.compareCPU(xml, 0) in (
>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2389, in
> compareCPU
>     if ret == -1: raise libvirtError ('virConnectCompareCPU() failed',
> conn=self)
> libvirtError: Requested operation is not valid: cannot get host CPU
> capabilities


is a libvirt bug. Which libvirt version is it? I think the bug was opened long ago, and maybe even fixed for F17.
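
Independently of the libvirt fix, the caller in the traceback could tolerate a failing compareCPU instead of letting the exception abort the whole capabilities call. The sketch below shows one such defensive pattern. It is not the content of any vdsm patch: FakeLibvirtError and FakeConnection are hypothetical stand-ins so the example runs without libvirt installed, and the XML snippet and model names are illustrative only.

```python
class FakeLibvirtError(Exception):
    """Stand-in for libvirt.libvirtError (hypothetical, for this sketch)."""


class FakeConnection:
    """Stand-in for a libvirt connection (hypothetical, for this sketch)."""

    def __init__(self, can_compare):
        self._can_compare = can_compare

    def compareCPU(self, xml, flags):
        if not self._can_compare:
            # Mirrors the failure seen in the traceback above
            raise FakeLibvirtError("cannot get host CPU capabilities")
        return 1  # assumed "compatible" result code for this sketch


def compatible_models(conn, models):
    """Return the models the host can run, or [] if comparison is unavailable."""
    result = []
    for model in models:
        xml = "<cpu match='minimum'><model>%s</model></cpu>" % model
        try:
            ok = conn.compareCPU(xml, 0) > 0
        except FakeLibvirtError:
            # Host cannot report CPU capabilities (e.g. a non-virt host):
            # report no compatible models rather than failing the request.
            return []
        if ok:
            result.append(model)
    return result


print(compatible_models(FakeConnection(False), ["Nehalem", "Penryn"]))
print(compatible_models(FakeConnection(True), ["Nehalem", "Penryn"]))
```

Returning an empty list keeps the capabilities reply well-formed; whether the engine should then treat such a host as operational is a separate question.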
Comment 5 Timothy Asir 2012-05-08 08:31:19 EDT

I tried to add a host from the oVirt UI and filled in the necessary information. It failed with an error: "Failed to install Host fc38. Step: VT_SVM; Details: Server does not support virtualization."

This is expected, as the node is a VM. Gluster is an add-on to vdsm, i.e. vdsm comes with virt support by default. I also tried with vdsm-hook-faqemu installed on the node, but got the same result.

If the bug report is related to this case, I guess it's not a bug.

I see vds_bootstrap.py has support for disabling the virtualization service (the -V option). However, I guess this is not yet used in the engine/UI.

Below is a log snippet for reference.

The vds_installer.log contains:
Tue, 08 May 2012 17:35:47 DEBUG    trying to fetch vds_bootstrap.py script cmd = '/usr/bin/curl -s -k -w %{http_code} -o /tmp/vds_bootstrap_a8967de4-99e4-4dd2-8e09-b8549b07fc0c.py http://FC16-1:80/Components/vds/vds_bootstrap.py'
Tue, 08 May 2012 17:35:47 DEBUG    <BSTRAP component='INSTALLER' status='OK' message='vds_bootstrap.py download succeeded'/>
Tue, 08 May 2012 17:35:47 DEBUG    trying to run /tmp/vds_bootstrap_a8967de4-99e4-4dd2-8e09-b8549b07fc0c.py script cmd = '/tmp/vds_bootstrap_a8967de4-99e4-4dd2-8e09-b8549b07fc0c.py -g -O gluster -t 2012-05-08T12:05:46 -f /tmp/firewall.conf.a8967de4-99e4-4dd2-8e09-b8549b07fc0c http://FC16-1:80/Components/vds/ 192.168.2.38 a8967de4-99e4-4dd2-8e09-b8549b07fc0c'
Comment 6 Itamar Heim 2012-05-08 16:30:00 EDT
Oved, isn't this supposed to be fixed?
Comment 7 Oved Ourfali 2012-05-09 01:20:20 EDT
Timothy, initially the gluster code Shireesh added didn't allow a cluster to lack virt capabilities, so even if you choose a cluster to be a "gluster cluster", it still has the enable-virt flag on.
As a result, when adding a host to this cluster, we do the virt validations (and we are not passing -V to disable virt). I don't know whether the issue was fixed and, if so, whether you are working with a version that includes the fix. Can you check enable_virt and enable_gluster in the vds_groups table in the database?

Itamar - I guess you were referring to Timothy's issue in your question. The issue the bug talks about is a bit different, and AFAIU not related to our changes... but until the gluster guys fix the issue above, even if a host is added, and the cpu flags are reported, the engine will bring the host to non-operational, as it is searching for virt flags/capabilities.
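
The check requested above could look roughly like the following. This is only a sketch: it uses an in-memory SQLite table as a stand-in for the engine database, takes the table and flag names from this comment, and assumes a `name` column and sample rows purely for illustration.

```python
import sqlite3


def cluster_service_flags(conn):
    """Return (cluster name, virt enabled, gluster enabled) for each cluster."""
    cur = conn.execute(
        "SELECT name, enable_virt, enable_gluster FROM vds_groups ORDER BY name"
    )
    return [(name, bool(virt), bool(gluster)) for name, virt, gluster in cur.fetchall()]


# Stand-in schema and rows; the real engine schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vds_groups (name TEXT, enable_virt INTEGER, enable_gluster INTEGER)"
)
conn.executemany(
    "INSERT INTO vds_groups VALUES (?, ?, ?)",
    [("Default", 1, 0), ("GlusterCluster", 1, 1)],
)

flags = cluster_service_flags(conn)
for name, virt, gluster in flags:
    print(name, "virt:", virt, "gluster:", gluster)
```

A cluster meant to be gluster-only that still shows virt enabled would match the behavior described in this comment.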
Comment 8 Timothy Asir 2012-05-09 06:34:45 EDT
I manually updated the enable_virt flag in the vds_groups table and tried the add-host flow. It works for me.
Comment 9 Shireesh 2012-05-11 04:51:05 EDT
Initially the system was designed for only the following two modes on a cluster:
 - Virt only
 - Virt + Gluster

A VM cannot be used as a host in either of these modes, and hence I believe the behavior was expected, except for the libvirt bug Dan mentioned.

It has recently been decided that "Gluster only" will also be a valid configuration for a cluster, and the UI is being enhanced to support this as well.
Comment 10 Shireesh 2012-05-11 04:52:26 EDT
When I say "system" above, I mean the webadmin UI part.
Comment 11 Haim 2012-05-12 12:43:15 EDT
Daniel, please validate that, with a proper libvirt version, a non-virt host (VM) can reply safely to getVdsCaps.
Comment 12 Oved Ourfali 2012-05-13 11:00:58 EDT
Looks like this is the libvirt bug
https://bugzilla.redhat.com/show_bug.cgi?id=770285

Can we close this bug as duplicate?
Or make this bug depend on the bug above?
Comment 13 Yaniv Kaul 2012-05-13 11:04:50 EDT
(In reply to comment #12)
> Looks like this is the libvirt bug
> https://bugzilla.redhat.com/show_bug.cgi?id=770285

Which is waiting on NEEDINFO to Idan, who left Red Hat and is not going to respond to the NEEDINFO most likely...

> 
> Can we close this bug as duplicate?
> Or make this bug depend on the bug above?

Comment 14 Oved Ourfali 2012-05-13 11:10:01 EDT
(In reply to comment #13)
> (In reply to comment #12)
> > Looks like this is the libvirt bug
> > https://bugzilla.redhat.com/show_bug.cgi?id=770285
> 
> Which is waiting on NEEDINFO to Idan, who left Red Hat and is not going to
> respond to the NEEDINFO most likely...
> 
Indeed. But that doesn't mean we can't close this bug (unless someone thinks the scenario is different).
> > 
> > Can we close this bug as duplicate?
> > Or make this bug depend on the bug above?
Comment 15 Yaniv Kaul 2012-05-13 11:15:30 EDT
(In reply to comment #14)
> (In reply to comment #13)
> > (In reply to comment #12)
> > > Looks like this is the libvirt bug
> > > https://bugzilla.redhat.com/show_bug.cgi?id=770285
> > 
> > Which is waiting on NEEDINFO to Idan, who left Red Hat and is not going to
> > respond to the NEEDINFO most likely...
> > 
> Indeed. But that doesn't mean we can't close this bug (unless someone thinks
> the scenario is different).

I guess it depends on whether we want to get the bug fixed or closed. Your call.
Personally, and from a QE perspective, it makes more sense to push the other bug to some resolution first.

> > > 
> > > Can we close this bug as duplicate?
> > > Or make this bug depend on the bug above?
Comment 16 Timothy Asir 2012-05-16 06:46:27 EDT
I got the libvirtError ('virConnectCompareCPU() failed') error when executing the getVdsCaps command in VMs created through a RHEL 6 KVM hypervisor.

While trying to narrow down the cause, I found that a CPU flag called 'pat' is missing or set to 'disable/off' for the VMs.

After I enabled the CPU flag 'pat' (changed the value to 'require' and applied it) and restarted the system, it worked fine and I could execute the getVdsCaps command successfully.

Is this expected behaviour?
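
To confirm from inside a guest whether a flag such as 'pat' is exposed, something like the sketch below could be used. It assumes the x86 /proc/cpuinfo layout with a "flags" line, looks only at the first processor entry, and runs against made-up sample text rather than a real guest.

```python
def cpu_has_flag(cpuinfo_text, flag):
    """Return True if the first 'flags' line in cpuinfo text lists the flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are whitespace separated, so compare whole tokens
            return flag in value.split()
    return False


# Made-up samples in /proc/cpuinfo style, for illustration only
SAMPLE_WITH_PAT = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae mce pat pse36\n"
SAMPLE_WITHOUT_PAT = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae mce pse36\n"

print(cpu_has_flag(SAMPLE_WITH_PAT, "pat"))
print(cpu_has_flag(SAMPLE_WITHOUT_PAT, "pat"))
```

On a real guest the text would come from reading /proc/cpuinfo, e.g. `cpu_has_flag(open("/proc/cpuinfo").read(), "pat")`.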
Comment 17 Dan Kenigsberg 2012-05-16 08:45:59 EDT
Timothy, I do not know much about the 'pat' flag or where you 'require' it.

Does this patch

http://gerrit.ovirt.org/4464

solve the issue for you even without a change in the guest?
Comment 18 Timothy Asir 2012-05-16 09:03:43 EDT
Yes, it solves the issue for me.
