Bug 1177733

Summary: Guest with a volume disk fails to start automatically after libvirtd restart
Product: Red Hat Enterprise Linux 7
Reporter: Pei Zhang <pzhang>
Component: libvirt
Assignee: Erik Skultety <eskultet>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 7.1
CC: dyuan, eskultet, mzhan, rbalakri, shyu, xuzhang, yanyang
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-1.2.15-1.el7
Doc Type: Bug Fix
Last Closed: 2015-11-19 06:06:03 UTC
Type: Bug

Description Pei Zhang 2014-12-30 07:05:06 UTC
Description of problem:
Guest with a volume disk fails to start automatically after a libvirtd restart.

Version:
libvirt-1.2.8-11.el7.x86_64
qemu-kvm-rhev-2.1.2-17.el7.x86_64
kernel-3.10.0-220.el7.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Define and start a guest with a volume disk.

# virsh dumpxml r7q2 | grep disk -A 9
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/r7q2-17.img'/>
      <backingStore/>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='volume' device='disk'>
      <driver name='qemu'/>
      <source pool='default' volume='qcow2.img'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>

# virsh list 
 Id    Name                           State
----------------------------------------------------
 44    r7q2                           running

# virsh pool-list --all
 Name                 State      Autostart 
-------------------------------------------
 default              active     yes       
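
For reference, the 'qcow2.img' volume referenced by the <source pool='default' volume='qcow2.img'/> element can be created up front with vol-create-as (the 8G capacity below is only illustrative):

# virsh vol-create-as default qcow2.img 8G --format qcow2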

2. Restart libvirtd.
# service libvirtd restart
Redirecting to /bin/systemctl restart  libvirtd.service

3. Check the guest state.

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     r7q2                           shut off

Actual result:
As shown in step 3, the guest is shut off after libvirtd is restarted.

Expected result:
In step 3, the guest should still be running.

Additional info (libvirtd debug log around the restart):
2014-12-30 06:04:52.553+0000: 25271: debug : virStoragePoolLookupByName:12701 : conn=0x7f56bc33e2c0, name=default
2014-12-30 06:04:52.553+0000: 25271: debug : virStoragePoolIsActive:16734 : pool=0x7f56b4001380
2014-12-30 06:04:52.553+0000: 25271: error : virStorageTranslateDiskSourcePool:3042 : unsupported configuration: storage pool 'default' containing volume 'qcow2.img' is not active
2014-12-30 06:04:52.553+0000: 25271: debug : virStoragePoolFree:13137 : pool=0x7f56b4001380
2014-12-30 06:04:52.553+0000: 25271: debug : qemuDomainObjEndJob:1490 : Stopping job: modify (async=none vm=0x7f56bc31b130 name=r7q2)
2014-12-30 06:04:52.553+0000: 25271: debug : qemuProcessStop:4920 : Shutting down vm=0x7f56bc31b130 name=r7q2 id=53 pid=25066 flags=0
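
A simple manual recovery once the daemon is back up is to start the pool again (if it is still inactive) and then restart the guest; this is only a workaround sketch, not a fix, and the pool-start step can be skipped if the default pool has already been autostarted by that point:

# virsh pool-start default
# virsh start r7q2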

Comment 2 Erik Skultety 2015-04-02 15:45:46 UTC
*** Bug 1125805 has been marked as a duplicate of this bug. ***

Comment 3 Erik Skultety 2015-04-07 14:36:13 UTC
Fixed upstream:

commit 2a31c5f030a14d85055dd8c40cf309995b09c112
Author: Erik Skultety <eskultet>
Date:   Mon Mar 16 16:30:03 2015 +0100

    storage: Introduce storagePoolUpdateAllState function
    
    The 'checkPool' callback was originally part of the storageDriverAutostart function,
    but the pools need to be checked earlier during initialization phase,
    otherwise we can't start a domain which mounts a volume after the
    libvirtd daemon restarted. This is because qemuProcessReconnect is called
    earlier than storageDriverAutostart. Therefore the 'checkPool' logic has been
    moved to storagePoolUpdateAllState which is called inside storageDriverInitialize.
    
    We also need a valid 'conn' reference to be able to execute 'refreshPool'
    during initialization phase. Though it isn't available until storageDriverAutostart
    all of our storage backends do ignore 'conn' pointer, except for RBD,
    but RBD doesn't support 'checkPool' callback, so it's safe to pass
    conn = NULL in this case.

v1.2.14-52-g2a31c5f

The whole v2 series with follow-ups can be found here: https://www.redhat.com/archives/libvir-list/2015-April/msg00088.html
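
One way to confirm which upstream release first contains the fix (assuming a libvirt git checkout) is to ask git which tag contains the commit:

$ git describe --contains 2a31c5f030a14d85055dd8c40cf309995b09c112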

Comment 5 Pei Zhang 2015-06-10 06:01:11 UTC
Verified with:
libvirt-1.2.16-1.el7.x86_64
qemu-kvm-rhev-2.3.0-2.el7.x86_64

Steps:
1. Start a guest with volume disks.

# virsh list 
 Id    Name                           State
----------------------------------------------------
 20    r72                            running

# virsh dumpxml r72 |grep disk -A 9
......
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/r7.2.qcow2'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <boot order='1'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    <disk type='volume' device='disk'>  <====logical volume disk 
      <driver name='qemu' type='raw'/>
      <source pool='logical-pool' volume='vol1'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </disk>
    <disk type='volume' device='disk'> <====disk volume disk 
      <driver name='qemu' type='raw'/>
      <source pool='disk-pool' volume='sdc1'/>
      <backingStore/>
      <target dev='vde' bus='virtio'/>
      <alias name='virtio-disk4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </disk>

2. Check pool status.
# virsh pool-list 
 Name                 State      Autostart 
-------------------------------------------
 default              active     yes       
 disk-pool            active     no        
 gluster-pool         active     no        
 logical-pool         active     no        
 netfs-nfs-pool       active     no        

# virsh vol-list disk-pool
 Name                 Path                                    
------------------------------------------------------------------------------
 sdc1                 /dev/sdc1                               
# virsh vol-list logical-pool 
 Name                 Path                                    
------------------------------------------------------------------------------
 vol1                 /dev/logical-pool/vol1       
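
For context, disk and logical pools of the kind used above can be defined roughly as follows; the source devices are hypothetical and assume the volume group and partition already exist:

# virsh pool-define-as logical-pool logical --source-dev /dev/sdb --source-name logical-pool --target /dev/logical-pool
# virsh pool-define-as disk-pool disk --source-dev /dev/sdc --target /dev
# virsh pool-start logical-pool
# virsh pool-start disk-pool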

3. Restart libvirtd.

# service libvirtd restart
Redirecting to /bin/systemctl restart  libvirtd.service

4. Check guest status; the guest was shut down.
# virsh list 
 Id    Name                           State
----------------------------------------------------

5. Check pool status; the disk pool and gluster pool were destroyed.
# virsh pool-list 
 Name                 State      Autostart 
-------------------------------------------
 default              active     yes       
 logical-pool         active     no        
 netfs-nfs-pool       active     no        

Note:
In steps 4 and 5, the guest with the volume disks was destroyed, and the disk pool and gluster pool are inactive after the libvirtd restart, while the dir, logical and netfs pools are still active. So I think the expected results are that the guest keeps running and all of the pools remain active.
Thanks.

Comment 6 Erik Skultety 2015-06-10 08:58:33 UTC
Yes, I agree, those are the expected results, but currently the RBD, Gluster, Disk and Sheepdog backends lack check support. The logic behind the storage state files is that any pool whose backend doesn't support the check method is marked as inactive by default. As soon as pool checking is implemented for the backends mentioned above, those pools won't disappear anymore.
Anyhow, if you created those pools as persistent, i.e. pool-define -> pool-start, you can list such pools using 'pool-list --all/--inactive' and restart them manually; the guest will then start normally. However, if the pool was created as transient (pool-create/pool-create-as), no pool configuration has been stored (only the state XML) and such a pool will disappear completely after a daemon restart.
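
To illustrate the persistent case: define and start the pool once (the XML file name below is hypothetical), and after a daemon restart the pool can be listed with 'pool-list --inactive' and brought back manually, after which the guest starts normally:

# virsh pool-define disk-pool.xml
# virsh pool-start disk-pool
# service libvirtd restart
# virsh pool-list --inactive
# virsh pool-start disk-pool
# virsh start r72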

Comment 7 Pei Zhang 2015-06-11 10:30:39 UTC
Thanks for your info.
Following your explanation, I tested dir, fs, netfs, logical and iscsi pools. Guests with volumes in these pools keep running even through a libvirtd restart.
A small issue with mpath pools was filed as a separate bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1230664

According to comment 5 and comment 6, moving this bug to verified.

Comment 9 errata-xmlrpc 2015-11-19 06:06:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html