Bug 1287681

Summary: Exception getting XMLDesc from vSphere 5.5
Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.2
Status: CLOSED NOTABUG
Severity: high
Priority: unspecified
Reporter: Shahar Havivi <shavivi>
Assignee: Michal Privoznik <mprivozn>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dyuan, jdenemar, juzhou, michal.skrivanek, mzhan, rbalakri, shavivi, tzheng
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-01-21 07:09:11 UTC
Bug Blocks: 1284412
Attachments (Description, Flags):
libvirt client log (flags: none)
Load vm screenshot from rhevm GUI (flags: none)
vdsm log file (flags: none)
libvirtd log file (flags: none)

Description Shahar Havivi 2015-12-02 13:51:01 UTC
An exception is raised when trying to probe some domains under vSphere 5.5 via libvirt. Versions:
libvirt-1.2.17-1.fc22.x86_64
libvirt-python-1.2.16-2.fc22.x86_64

The exception is raised by some methods (such as domain.XMLDesc() and domain.state()), while others succeed (such as domain.name()).

Example:
In [19]: all = con.listAllDomains()
In [20]: vm = all[0]
In [21]: vm.name()
Out[21]: 'vm1'
In [22]: vm.XMLDesc()
libvirt: ESX Driver error : Domain not found: Could not find domain with UUID '421629b8-7615-27ef-a454-3269241bd56f'
---------------------------------------------------------------------------
libvirtError                              Traceback (most recent call last)
<ipython-input-22-43175319f441> in <module>()
----> 1 vm.XMLDesc(0)

/usr/lib64/python2.7/site-packages/libvirt.pyc in XMLDesc(self, flags)
    484         capabilities of the host. """
    485         ret = libvirtmod.virDomainGetXMLDesc(self._o, flags)
--> 486         if ret is None: raise libvirtError ('virDomainGetXMLDesc() failed', dom=self)
    487         return ret
    488

libvirtError: Domain not found: Could not find domain with UUID '421629b8-7615-27ef-a454-3269241bd56f'
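A minimal sketch of how the affected domains could be enumerated in one pass instead of probing them one by one. It is duck-typed so that it runs without a hypervisor: `probe_domains` and `FakeDomain` are hypothetical names introduced here, and the fake raises `RuntimeError` where the real binding raises `libvirt.libvirtError`. The sketch also assumes that `UUIDString()` keeps working for affected domains, as `name()` does.

```python
def probe_domains(domains, error_type=Exception):
    """Split domains into those whose XML can be fetched and those that fail.

    `domains` is any iterable of objects offering name(), UUIDString() and
    XMLDesc(flags), which is the subset of libvirt.virDomain used here.
    """
    ok, failed = [], []
    for dom in domains:
        name = dom.name()  # reported to succeed even for affected domains
        try:
            dom.XMLDesc(0)
        except error_type as exc:
            failed.append((name, dom.UUIDString(), str(exc)))
        else:
            ok.append(name)
    return ok, failed


class FakeDomain:
    """Hypothetical stand-in for libvirt.virDomain, for illustration only."""

    def __init__(self, name, uuid, broken=False):
        self._name = name
        self._uuid = uuid
        self._broken = broken

    def name(self):
        return self._name

    def UUIDString(self):
        return self._uuid

    def XMLDesc(self, flags=0):
        if self._broken:
            raise RuntimeError(
                "Domain not found: Could not find domain with UUID '%s'"
                % self._uuid
            )
        return "<domain type='vmware'><name>%s</name></domain>" % self._name


demo_domains = [
    FakeDomain("vm1", "421629b8-7615-27ef-a454-3269241bd56f", broken=True),
    FakeDomain("vm2", "4218e176-1227-31f4-adc3-824fdc6bd8e7"),
]
demo_ok, demo_failed = probe_domains(demo_domains, RuntimeError)
for failed_name, failed_uuid, message in demo_failed:
    print(failed_name, failed_uuid, message)
```

Against a real connection, the call would presumably look like `probe_domains(con.listAllDomains(), libvirt.libvirtError)`.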

Comment 1 Michal Skrivanek 2015-12-02 14:34:16 UTC
(In reply to Shahar Havivi from comment #0)
Shahar, I expect that some more details, and perhaps access to a testing instance, will be needed.
This blocks the usability of a major RHEV 3.6 feature.

Comment 2 Jiri Denemark 2015-12-02 14:39:07 UTC
So why is this bug filed against RHEL when you are using Fedora according to the package versions? And would you mind sharing debug logs from libvirt?

Comment 3 Shahar Havivi 2015-12-03 09:33:48 UTC
Created attachment 1101720 [details]
libvirt client log

Comment 4 Shahar Havivi 2015-12-03 09:34:45 UTC
The same happens under RHEL 7.2:

libvirt-daemon-1.2.17-5.el7.x86_64
libvirt-python-1.2.17-2.el7.x86_64

The attached log was taken on RHEL 7.2.

Comment 5 tingting zheng 2016-01-19 05:59:21 UTC
I tried to reproduce this bug in my environment:
Version:
libvirt-1.2.17-13.el7_2.2.x86_64
vdsm-4.17.15-0.el7ev.noarch

rhevm:
rhevm-3.6.1.1-0.1.el6.noarch

Steps:
1. Launch the rhevm GUI.
2. Go to Virtual Machine -> Import and fill in the vCenter 5.5/ESXi 5.5 info and the username/password for vCenter 5.5.
3. Click Load. The guests on ESXi 5.5 are loaded successfully; refer to the attached screenshot.
4. Check the vdsm log. I found the error below, but it exists only in the log file and does not affect the Load operation in the rhevm GUI.
5. Check on ESXi 5.5. I didn't find a guest whose UUID is the one reported as missing in the log file: "libvirtError: Domain not found: Could not find domain with UUID '4202108e-a139-5ee6-bcb8-ae7e48590877'".

virsh # dumpxml 4202108e-a139-5ee6-bcb8-ae7e48590877
error: failed to get domain '4202108e-a139-5ee6-bcb8-ae7e48590877'
error: Domain not found: No domain with name '4202108e-a139-5ee6-bcb8-ae7e48590877'

# vim vdsm.log
jsonrpc.Executor/3::ERROR::2016-01-19 00:44:58,650::v2v::158::root::(get_external_vms) error getting domain information
Traceback (most recent call last):
  File "/usr/share/vdsm/v2v.py", line 156, in get_external_vms
    _add_vm_info(vm, params)
  File "/usr/share/vdsm/v2v.py", line 662, in _add_vm_info
    if vm.state()[0] == libvirt.VIR_DOMAIN_SHUTOFF:
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2661, in state
    if ret is None: raise libvirtError ('virDomainGetState() failed', dom=self)
libvirtError: Domain not found: Could not find domain with UUID '4202108e-a139-5ee6-bcb8-ae7e48590877'
jsonrpc.Executor/5::DEBUG::2016-01-19 00:44:58,898::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID': u'8049050b-8340-4f13-a726-7ed7d8030789'}
jsonrpc.Executor/5::DEBUG::2016-01-19 00:44:58,898::task::595::Storage.TaskManager.Task::(_updateState) Task=`052d1868-3ec3-4b51-90d5-d1c16f8bcfdf`::moving from state init -> state preparing

Comment 6 tingting zheng 2016-01-19 06:00:04 UTC
Created attachment 1116041 [details]
Load vm screenshot from rhevm GUI

Comment 7 Shahar Havivi 2016-01-19 07:55:28 UTC
(In reply to tingting zheng from comment #5)
You can log in via virsh and get the same error; there is no need for vdsm. (You can also check whether there is more info in libvirt's log.)

Comment 8 tingting zheng 2016-01-19 08:07:10 UTC
(In reply to Shahar Havivi from comment #7)
> (In reply to tingting zheng from comment #5)
> You can login via virsh and get the same error, no need for vdsm (you can
> try to see if there is more info in libvirts log)


In my environment, I can use virsh to list the guests and to get their XML via virsh dumpxml successfully.

Both virsh and the rhevm GUI can show/load the guests successfully; the error exists only in the log file and does not affect these operations.

# virsh -c vpx://root@10.66.4.103/tzheng-demo/10.66.106.63/?no_verify=1
Enter root's password for 10.66.4.103: 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # list --all
 Id    Name                           State
----------------------------------------------------
 -     Auto-esx5.5-rhel7-clone        shut off
 -     Auto-esx5.5-rhel7.1-#clone-mulcpu-disk shut off
 -     Auto-esx5.5-rhel7.1-fromtemplate shut off
 -     Auto-esx5.5-rhel7.1-snapshot   shut off
 -     bug-1164853-second             shut off
 -     esx5.5-rhel5.11-i386           shut off
 -     esx5.5-rhel5.11-x86_64         shut off
 -     esx5.5-rhel6.7-i386            shut off
 -     esx5.5-rhel6.7-x86_64          shut off
 -     esx5.5-rhel7.2-x86_64          shut off
 -     esx5.5-win2003-i386            shut off
 -     esx5.5-win2003-x86_64          shut off
 -     esx5.5-win2008-i386            shut off
 -     esx5.5-win2008-x86_64          shut off
 -     esx5.5-win2008r2-x86_64        shut off
 -     esx5.5-win2012-x86_64          shut off
 -     esx5.5-win2012R2-x86_64        shut off
 -     esx5.5-win7-i386               shut off
 -     esx5.5-win7-x86_64             shut off
 -     esx5.5-win8-i386               shut off
 -     esx5.5-win8-x86_64             shut off
 -     esx5.5-win8.1-i386             shut off
 -     esx5.5-win8.1-x86_64           shut off
 -     juzhou-2012r2-efi              shut off
 -     juzhou-test-efi                shut off
 -     rhel6-vmtools-bug1151610-vmtools-auto shut off
 -     rhel7-juzhou-standard-cdrom-multidisks shut off
 -     test#1                         shut off
 -     test-bug1167302                shut off
 -     testblank space                shut off

virsh # dumpxml esx5.5-rhel7.2-x86_64
<domain type='vmware'>
  <name>esx5.5-rhel7.2-x86_64</name>
  <uuid>4218e176-1227-31f4-adc3-824fdc6bd8e7</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <cputune>
    <shares>1000</shares>
  </cputune>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <disk type='file' device='disk'>
      <source file='[ESX5.5] esx5.5-rhel7.2-x86_64/esx5.5-rhel7.2-x86_64.vmdk'/>
      <target dev='sda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='scsi' index='0' model='vmpvscsi'/>
    <interface type='bridge'>
      <mac address='00:50:56:98:8a:53'/>
      <source bridge='VM Network'/>
      <model type='vmxnet3'/>
    </interface>
    <video>
      <model type='vmvga' vram='8192'/>
    </video>
  </devices>
</domain>

virsh #

Comment 9 Michal Privoznik 2016-01-19 16:48:02 UTC
(In reply to tingting zheng from comment #5)
> I tried to reproduce this bug on my environment:
> Version:
> libvirt-1.2.17-13.el7_2.2.x86_64
> vdsm-4.17.15-0.el7ev.noarch
> 

Can you please provide full debug logs (even libvirtd ones)? Thanks.

Comment 10 tingting zheng 2016-01-20 08:11:55 UTC
(In reply to Michal Privoznik from comment #9)
> (In reply to tingting zheng from comment #5)
> > I tried to reproduce this bug on my environment:
> > Version:
> > libvirt-1.2.17-13.el7_2.2.x86_64
> > vdsm-4.17.15-0.el7ev.noarch
> > 
> 
> Can you please provide full debug logs (even libvirtd ones)? Thanks.

After I rebooted my ESXi 5.5 server, I could no longer reproduce this bug.

I then tried the environment that Shahar used in the description. I still get the error in vdsm.log (attached), while virsh connects to ESXi successfully, and virsh list --all and dumpxml also work well.

Comment 11 tingting zheng 2016-01-20 08:12:36 UTC
Created attachment 1116546 [details]
vdsm log file

Comment 12 tingting zheng 2016-01-20 08:13:52 UTC
Created attachment 1116547 [details]
libvirtd log file

Comment 13 tingting zheng 2016-01-20 08:55:38 UTC
(In reply to tingting zheng from comment #10)
> (In reply to Michal Privoznik from comment #9)
> > (In reply to tingting zheng from comment #5)
> > > I tried to reproduce this bug on my environment:
> > > Version:
> > > libvirt-1.2.17-13.el7_2.2.x86_64
> > > vdsm-4.17.15-0.el7ev.noarch
> > > 
> > 
> > Can you please provide full debug logs (even libvirtd ones)? Thanks.
> 
> After I reboot my esxi 5.5 server,I can not reproduce this bug anymore.
> 
> I tried to use the environment as Shahar used in description,I can still get
> the error info in vdsm.log(attached) while virsh can connect esxi
> successfully,virsh list --all and dumpxml also works well.

I tried a few more times and found that it's really weird: virsh list --all fails sometimes, but there is no related info in libvirtd.log.

virsh # list --all
error: Failed to list domains
error: internal error: Could not get UUID of virtual machine

But after a few seconds it works, as shown below, and then it fails again a few seconds later.
virsh # list --all
 Id    Name                           State
----------------------------------------------------
 333   admin-templater                running
 338   admin-template                 running
 5205  vmrc-fedora-20-ff30-chrome-36  running
 11846 psav-ipa-rhel7                 running
 11861 psav-openldap-rhel7            running
 26750 omaciel-rhel7-01               running
 74436 vmware.idmqe.lab.eng.bos.redhat.com running
 97912 RHOS7-GA                       running
 99770 admin-vm-delete-test-dnd       running
 101556 win7_ohfour_perf               running
……

@Michal, I will send you the info about the environment so that you can debug directly; I hope that helps to locate the exact problem.
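To characterize how often the listing flips between success and failure, the call could be sampled repeatedly. This is only a sketch under assumptions: `sample_listing` and `make_flaky` are hypothetical helpers, the flaky callable stands in for something like `conn.listAllDomains`, and `RuntimeError` stands in for `libvirt.libvirtError`.

```python
import time


def sample_listing(list_fn, attempts=5, delay=2.0,
                   error_type=Exception, sleep=time.sleep):
    """Call list_fn several times and record (succeeded, detail) per attempt.

    detail is the number of listed items on success, or the error text on
    failure. The sleep function is injectable so the sketch can be tested.
    """
    results = []
    for attempt in range(attempts):
        try:
            count = len(list_fn())
        except error_type as exc:
            results.append((False, str(exc)))
        else:
            results.append((True, count))
        if attempt < attempts - 1:
            sleep(delay)
    return results


def make_flaky(outcomes):
    """Hypothetical helper: build a callable that replays canned outcomes."""
    pending = iter(outcomes)

    def list_fn():
        outcome = next(pending)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    return list_fn


demo_results = sample_listing(
    make_flaky([
        RuntimeError("internal error: Could not get UUID of virtual machine"),
        ["admin-templater", "admin-template"],
    ]),
    attempts=2,
    delay=0,
    error_type=RuntimeError,
    sleep=lambda seconds: None,
)
for succeeded, detail in demo_results:
    print(succeeded, detail)
```

Recording a timestamp next to each sample would make it easier to correlate failures with the vCenter side, should logs from there become available.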

Comment 14 Michal Privoznik 2016-01-20 14:37:57 UTC
So I've managed to get access to a vCenter 5.5 host. Thanks, tingting! And I think this is actually their bug. Here is a snippet of the SOAP communication between libvirt and vCenter:

1) Call API to list all VMs:
<soapenv:Envelope>
  <soapenv:Body>
    <RetrieveProperties>
      <specSet>
        <propSet>
          <type>VirtualMachine</type>
          <pathSet>configStatus</pathSet>
          <pathSet>runtime.powerState</pathSet>
          <pathSet>config.uuid</pathSet>
          <pathSet>name</pathSet>
        </propSet>
      </specSet>
    </RetrieveProperties>
  </soapenv:Body>
</soapenv:Envelope>

(I've stripped unimportant things, such as xmlns attributes.)

2) vCenter replies with a lot of VMs, among them this one:
<returnval>
  <obj type="VirtualMachine">vm-125841</obj>
  <propSet>
    <name>config.uuid</name>
    <val>42165086-3482-2833-8b65-8bf96c5a846d</val>
  </propSet>
  <propSet>
    <name>configStatus</name>
    <val xsi:type="ManagedEntityStatus">green</val>
  </propSet>
  <propSet>
    <name>name</name>
    <val xsi:type="xsd:string">s_tpl_downstream-54z_151201_a4u5F6bX</val>
  </propSet>
  <propSet>
    <name>runtime.powerState</name>
    <val xsi:type="VirtualMachinePowerState">poweredOff</val>
  </propSet>
</returnval>

So vCenter says that the UUID is 42165...something. Now let it look up the UUID for us.

3) Call this API to find a managed object by UUID:
<soapenv:Envelope>
  <soapenv:Body>
    <FindByUuid>
      <datacenter>datacenter-21</datacenter>
      <uuid>42165086-3482-2833-8b65-8bf96c5a846d</uuid>
      <vmSearch>true</vmSearch>
    </FindByUuid>
  </soapenv:Body>
</soapenv:Envelope>

4) However, vCenter's response is empty:
<soapenv:Envelope>
  <soapenv:Body>
    <FindByUuidResponse></FindByUuidResponse>
  </soapenv:Body>
</soapenv:Envelope>


Because of the contradiction in vCenter's replies, I think this is a vCenter bug. If so, there's not much libvirt can or should do about it.
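The contradiction can also be checked mechanically from captured SOAP bodies. The following is a sketch, not libvirt code: the helper names are hypothetical, the samples are abbreviated from the snippets above, and the `xmlns:soapenv` declaration (the standard SOAP 1.1 envelope namespace) is re-added only so that the XML parses.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return an element tag without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def listed_uuid(returnval_xml):
    """Extract config.uuid from one <returnval> of RetrieveProperties."""
    root = ET.fromstring(returnval_xml.strip())
    for elem in root.iter():
        if local(elem.tag) != "propSet":
            continue
        fields = {local(c.tag): (c.text or "").strip() for c in elem}
        if fields.get("name") == "config.uuid":
            return fields.get("val")
    return None


def found_object(response_xml):
    """Return the managed object id from a FindByUuid response, or None."""
    root = ET.fromstring(response_xml.strip())
    for elem in root.iter():
        if local(elem.tag) == "returnval":
            return (elem.text or "").strip() or None
    return None


def is_contradictory(returnval_xml, response_xml):
    """True if a VM is listed with a UUID that FindByUuid cannot resolve."""
    return (listed_uuid(returnval_xml) is not None
            and found_object(response_xml) is None)


LISTED = """
<returnval>
  <obj type="VirtualMachine">vm-125841</obj>
  <propSet>
    <name>config.uuid</name>
    <val>42165086-3482-2833-8b65-8bf96c5a846d</val>
  </propSet>
</returnval>
"""

EMPTY_RESPONSE = """
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <FindByUuidResponse></FindByUuidResponse>
  </soapenv:Body>
</soapenv:Envelope>
"""

print(listed_uuid(LISTED), found_object(EMPTY_RESPONSE),
      is_contradictory(LISTED, EMPTY_RESPONSE))
```

Matching on local tag names keeps the sketch independent of whichever default namespace the real messages carry.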

Comment 15 Michal Privoznik 2016-01-21 07:09:11 UTC
I forgot to mention that if I try to look up a different machine, it works just fine:

Request:
<soapenv:Envelope>
  <soapenv:Body>
    <FindByUuid>
      <datacenter>datacenter-21</datacenter>
      <uuid>421616f8-032e-26e0-675b-20f6d7f2293e</uuid>
      <vmSearch>true</vmSearch>
    </FindByUuid>
  </soapenv:Body>
</soapenv:Envelope>

Response:
<soapenv:Envelope>
  <soapenv:Body>
    <FindByUuidResponse>
      <returnval type="VirtualMachine">vm-11861</returnval>
    </FindByUuidResponse>
  </soapenv:Body>
</soapenv:Envelope>

Therefore I am closing this one, as this is not a libvirt bug. I'd like to report this bug to VMware; however, I have no logs or vCenter state, and I don't know what their bugzilla is (if they have one).