Bug 1299927

Summary: Domain is not undefined when live migrated out of the src host.
Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.2
Status: CLOSED NOTABUG
Severity: high
Priority: high
Reporter: Roman Hodain <rhodain>
Assignee: Jiri Denemark <jdenemar>
QA Contact: Virtualization Bugs <virt-bugs>
CC: agkesos, cww, jdenemar, rbalakri, rhodain
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2016-02-10 15:54:34 UTC
Bug Blocks: 1203710

Description Roman Hodain 2016-01-19 15:00:34 UTC
Description of problem:
When a VM is created from python:

     self._connection.createXML(domxml, flags)

and it is then live migrated off this hypervisor, the domain remains listed by "virsh list --all" on the source host as "shut off". It is reported as running on the destination hypervisor.
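
For context, createXML() creates a transient domain, which is what RHEV/vdsm relies on; the contrast with a persistent definition is roughly the following libvirt-python sketch (the connection URI and domain XML are illustrative placeholders, not taken from the customer environment):

    import libvirt

    # Hypothetical minimal domain XML; in the real setup vdsm builds it.
    DOMXML = """
    <domain type='kvm'>
      <name>example-vm</name>
      <memory unit='MiB'>1024</memory>
      <vcpu>1</vcpu>
      <os><type>hvm</type></os>
    </domain>
    """

    conn = libvirt.open("qemu:///system")

    # Transient domain: exists only while it is running and has no config
    # file under /etc/libvirt/qemu/. After a successful live migration the
    # source libvirtd forgets it completely.
    dom = conn.createXML(DOMXML, 0)

    # Persistent domain: defineXML() writes /etc/libvirt/qemu/<name>.xml,
    # so the source keeps listing the domain as "shut off" after migration.
    # dom = conn.defineXML(DOMXML)
    # dom.create()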

Version-Release number of selected component (if applicable):
libvirt-1.2.17-13.el7_2.2.x86_64

How reproducible:
100% in one environment

Steps to Reproduce:
1. Start a VM.
2. Live migrate it to another host.
3. Run "virsh list --all" on the source host (a rough libvirt-python equivalent of these steps follows below).
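
Roughly the same steps expressed with libvirt-python (a sketch: the destination URI and domain name are placeholders, and the flags are just a plausible choice for a live peer-to-peer migration, not necessarily what vdsm passes):

    import libvirt

    src = libvirt.open("qemu:///system")       # run on the source host
    dom = src.lookupByName("example-vm")       # placeholder domain name

    # Step 2: live-migrate the running domain to the destination host.
    flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER
    dom.migrateToURI("qemu+ssh://destination/system", flags, None, 0)

    # Step 3: equivalent of "virsh list --all" on the source. A transient
    # domain should no longer be listed at all; the bug report shows it
    # staying around as "shut off" instead.
    for d in src.listAllDomains(0):
        print(d.name(), "running" if d.isActive() else "shut off")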

Actual results:
The VM is reported as "shut off" by "virsh list --all" on the source host.

Expected results:
The domain is forgotten by libvirt after the live migration and is not reported at all on the source host.

Additional info:

Comment 5 Jiri Denemark 2016-01-21 13:33:34 UTC
The attached logs claim to be from the destination host (in fact they cover several migration attempts, with the host even acting as the source of another migration).

What we need are debug logs from *both* source and destination hosts. It would be cool if the logs contained just the single problematic migration attempt rather than several attempts (but don't strip the messages logged when a daemon starts). That is, stop the daemons, make the logs empty, start the daemons, migrate, fetch the logs.

Comment 7 Alexandros Gkesos 2016-01-22 11:50:02 UTC
Customer is fine with testing it again.
To avoid asking again and again for more tests:

Do you want the customer to just start the VM on the "problematic host" and migrate it, or to start it on the healthy host, migrate it to the problematic one and then back to the healthy one (like in the previous test)?

Also, if you prefer the first option, do you want the libvirt logs to cover the VM start as well, or should the steps be the following?

a) Start the VM
b) Clear the logs
c) Restart Libvirt
d) Migrate
e) Gather logs

Comment 9 Jiri Denemark 2016-01-22 12:41:07 UTC
(In reply to Alexandros Gkesos from comment #7)
> Customer is fine with testing it again
> To avoid asking again and again for more tests:
> 
> Do you want the customer to just start the VM on the "problematic host" and
> migrate it or to start it on the healthy host, migrate it to the problematic
> one and then back to the healthy ? (like the previous test)

Either way is fine. The following are the steps I'd like them to do:
 
a) Clear the logs
b) Restart libvirtd
c) Start the VM or migrate it from another host
d) Migrate it to another host
e) Check whether libvirtd still knows about the domain which was just migrated away (virsh list)
f) Gather logs

Comment 16 Jiri Denemark 2016-02-05 12:11:44 UTC
So according to the logs:

2016-01-26 09:54:55.944+0000: 6263: info : virDomainObjListLoadAllConfigs:22615 :
   Scanning for configs in /etc/libvirt/qemu
2016-01-26 09:54:55.944+0000: 6263: info : virDomainObjListLoadAllConfigs:22639 :
   Loading config file 'defulxs2010.xml'

there is a /etc/libvirt/qemu/defulxs2010.xml file, which is not supposed to be there since RHEV uses transient domains. Apparently someone must have defined that domain as persistent. However, if vdsClient lists the domain as DOWN after it is migrated from the source host, it must have reported it in the same way even before the domain was initially started on the source host.
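
For reference, the transient/persistent distinction can be checked on the source host with a small libvirt-python sketch like the following (the domain name comes from the log excerpt above; /etc/libvirt/qemu is the standard qemu driver config directory):

    import os
    import libvirt

    conn = libvirt.openReadOnly("qemu:///system")

    try:
        dom = conn.lookupByName("defulxs2010")
        # isPersistent() == 1 means the domain has an on-disk definition and
        # will keep showing up in "virsh list --all" even when it is not
        # running on this host.
        print("persistent:", dom.isPersistent())
    except libvirt.libvirtError:
        print("domain not known to libvirtd on this host")

    # The persistent definition, if any, lives here for the qemu driver:
    print(os.path.exists("/etc/libvirt/qemu/defulxs2010.xml"))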

Comment 17 Alexandros Gkesos 2016-02-05 13:42:32 UTC
Hello Jiri,

The host was fully rebooted and checked with vdsClient before the VM start and migration. I am quite sure that the output was completely empty beforehand.

Comment 18 Jiri Denemark 2016-02-05 13:50:59 UTC
Hmm, I guess vdsClient does something different than virsh list. Anyway, can you check if /etc/libvirt/qemu/defulxs2010.xml exists (possibly after restarting the host)?

Comment 19 Alexandros Gkesos 2016-02-10 15:44:49 UTC
Hello Jiri,

Indeed the file is there after restarting the Host.

[root@defulxh0300 ~]# vdsClient -s 0 list


[root@defulxh0300 ~]# virsh -r list
 Id    Name                           State
----------------------------------------------------


[root@defulxh0300 ~]# ls -lrt /etc/libvirt/qemu/
total 12
drwx------. 3 root root 4096 Nov 23 13:49 networks
-rw-------. 1 root root 5112 Jan  8 14:33 defulxs2010.xml

Comment 20 Jiri Denemark 2016-02-10 15:54:34 UTC
OK, then just run "virsh undefine defulxs2010" to remove it.
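
If scripting it is preferred, a rough libvirt-python equivalent of that virsh command (assuming a read-write qemu:///system connection and that the domain is currently shut off on this host):

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("defulxs2010")

    # Removes the persistent definition (/etc/libvirt/qemu/defulxs2010.xml),
    # so libvirtd forgets the shut-off domain entirely, which is what is
    # expected for RHEV's transient domains.
    dom.undefine()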

Comment 21 Alexandros Gkesos 2016-02-12 14:40:27 UTC
I will ask the customer, although after the restart he can migrate the VM to this host (let's say Host1) and back to another host (Host2), BUT he cannot migrate it back to Host1 again.
That's why I am not sure whether it is related, but let's see the results.
That's why i am not sure if it's related, but let's see the results.