Bug 1362349

Summary: Persistent fs pool is undefined after startup fails
Product: Red Hat Enterprise Linux 7
Reporter: Yang Yang <yanyang>
Component: libvirt
Assignee: John Ferlan <jferlan>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.3
CC: dyuan, rbalakri, ydu, yisun
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-2.0.0-5.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 18:51:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description   Flags
libvirtd.log  none

Description Yang Yang 2016-08-02 03:05:22 UTC
Description of problem:
I defined and built an fs pool with the xfs format, then edited the pool and changed the format to ext4. I then started the pool with the --no-overwrite flag. Pool startup fails, but the pool is also unexpectedly undefined.

Version-Release number of selected component (if applicable):
libvirt-2.0.0-3.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. define a fs pool
# cat /yy/fs-pool.xml
<pool type='fs'>
  <name>fs</name>
  <source>
    <device path='/dev/sdc'/>
    <format type='xfs'/>
  </source>
  <target>
    <path>/var/lib/libvirt/images/fs</path>
  </target>
</pool>

# virsh pool-define /yy/fs-pool.xml
Pool fs defined from /yy/fs-pool.xml

2. build fs pool
# virsh pool-build fs
Pool fs built
# virsh pool-list --all --type fs
 Name                 State      Autostart
-------------------------------------------
 fs                   inactive   no  

3. edit fs pool, change source format to ext4
# virsh pool-edit fs

<format type='ext4'/>

Pool fs XML configuration edited.

4. start fs pool with the --no-overwrite flag
# virsh pool-start fs --no-overwrite
error: Failed to start pool fs
error: Failed to make filesystem of type 'ext4' on device '/dev/sdc': Invalid argument

[root@rhel7_test ~]# virsh pool-list --all --type fs
 Name                 State      Autostart
-------------------------------------------

Actual results:
fs pool is undefined after startup fails

Expected results:
fs pool remains defined (listed as inactive) after the failed start
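
The actual/expected check above can be scripted. The following is a minimal sketch, not part of the original report: it parses the text printed by `virsh pool-list --all` and reports whether a named pool is still listed. The helper names (`parse_pool_list`, `pool_still_defined`) are hypothetical, the two sample strings are copied from the listings in this report, and the parser assumes the three-column table layout shown here.

```python
# Sample listings copied from this report (assumed table layout:
# header row, dashed separator, then one row per pool).
BUGGY_OUTPUT = """\
 Name                 State      Autostart
-------------------------------------------

"""

FIXED_OUTPUT = """\
 Name                 State      Autostart
-------------------------------------------
 fs                   inactive   no

"""


def parse_pool_list(output):
    """Return {pool_name: state} parsed from `virsh pool-list --all` text."""
    pools = {}
    past_separator = False
    for line in output.splitlines():
        stripped = line.strip()
        if not past_separator:
            # Pool rows only begin after the dashed separator line.
            if stripped and set(stripped) == {"-"}:
                past_separator = True
            continue
        fields = stripped.split()
        if len(fields) >= 2:
            pools[fields[0]] = fields[1]
    return pools


def pool_still_defined(output, name):
    """True if the pool is still listed, whatever its state."""
    return name in parse_pool_list(output)
```

In a real run the text would come from invoking `virsh pool-list --all --type fs` after the failed `pool-start`; that step is left out here so the sketch stays independent of a libvirt host.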

Additional info:

Comment 2 Yang Yang 2016-08-03 10:09:51 UTC
Created attachment 1186999 [details]
libvirtd.log

Comment 3 John Ferlan 2016-08-04 19:38:49 UTC
I've tried to reproduce this with varying degrees of success, but I do at least know what the problem is. The code in storagePoolCreate that allows a poolBuild to be attempted was essentially copied from storagePoolCreateXML (i.e., the non-persistent storage pool path). So when the build fails, the pool object is destroyed via a call to virStoragePoolObjRemove, which shouldn't happen in the storagePoolCreate path since it operates on a previously defined pool. The following snippet of the libvirtd.log confirms the theory:

...
2016-08-03 10:08:17.728+0000: 14964: debug : virStorageBackendMakeFileSystem:763 : source device: '/dev/sdb' format: 'ext4'
2016-08-03 10:08:17.728+0000: 14964: debug : virStorageBackendFileSystemProbe:632 : Probing for existing filesystem of type ext4 on device /dev/sdb
2016-08-03 10:08:17.729+0000: 14964: info : virStorageBackendFileSystemProbe:663 : No filesystem of type 'ext4' found on device '/dev/sdb'
2016-08-03 10:08:17.730+0000: 14964: debug : virCommandRunAsync:2429 : About to run /usr/sbin/mkfs -t ext4 /dev/sdb
2016-08-03 10:08:17.732+0000: 14964: debug : virFileClose:102 : Closed fd 24
2016-08-03 10:08:17.732+0000: 14964: debug : virFileClose:102 : Closed fd 26
2016-08-03 10:08:17.732+0000: 14964: debug : virFileClose:102 : Closed fd 28
2016-08-03 10:08:17.732+0000: 14964: debug : virCommandRunAsync:2432 : Command result 0, with PID 15456
2016-08-03 10:08:17.740+0000: 14964: error : virCommandWait:2553 : internal error: Child process (/usr/sbin/mkfs -t ext4 /dev/sdb) unexpected exit status 1: 2016-08-03 10:08:17.733+0000: 15456: debug : virFileClose:102 : Closed fd 26
2016-08-03 10:08:17.733+0000: 15456: debug : virFileClose:102 : Closed fd 28
2016-08-03 10:08:17.733+0000: 15456: debug : virFileClose:102 : Closed fd 24
mke2fs 1.42.9 (28-Dec-2013)

2016-08-03 10:08:17.740+0000: 14964: debug : virCommandRun:2280 : Result status 0, stdout: '/dev/sdb is entire device, not just one partition!
Proceed anyway? (y,n) ' stderr: '2016-08-03 10:08:17.733+0000: 15456: debug : virFileClose:102 : Closed fd 26
2016-08-03 10:08:17.733+0000: 15456: debug : virFileClose:102 : Closed fd 28
2016-08-03 10:08:17.733+0000: 15456: debug : virFileClose:102 : Closed fd 24
mke2fs 1.42.9 (28-Dec-2013)
'
2016-08-03 10:08:17.740+0000: 14964: debug : virFileClose:102 : Closed fd 25
2016-08-03 10:08:17.740+0000: 14964: debug : virFileClose:102 : Closed fd 27
2016-08-03 10:08:17.740+0000: 14964: error : virStorageBackendExecuteMKFS:725 : Failed to make filesystem of type 'ext4' on device '/dev/sdb': Invalid argument
2016-08-03 10:08:17.740+0000: 14964: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x7f78c8006440
2016-08-03 10:08:17.740+0000: 14964: info : virObjectUnref:261 : OBJECT_DISPOSE: obj=0x7f78c8006440
2016-08-03 10:08:17.740+0000: 14964: debug : virStoragePoolDispose:515 : release pool 0x7f78c8006440 fs 8c8d7ffc-4fc4-4c6c-a710-68cda3357b6e
...


A patch has been posted upstream to resolve the issue, see:

http://www.redhat.com/archives/libvir-list/2016-August/msg00297.html

Comment 4 John Ferlan 2016-08-05 13:35:26 UTC
Pushed upstream:

commit fbfd6f2103a56df73238b023032dfb1242a4d7d5
Author: John Ferlan <jferlan>
Date:   Thu Aug 4 15:24:48 2016 -0400

    storage: Don't remove the pool for buildPool failure in storagePoolCreate
    
...
    
    When adding the ability to build the pool during the start pool processing
    using the similar flags as buildPool processing would use, the code was
    essentially cut-n-pasted from storagePoolCreateXML.  However, that included
    a call to virStoragePoolObjRemove which shouldn't happen within the
    storagePoolCreate path since that'll remove the pool from the list of
    pools only to be rediscovered if libvirtd restarts.
    
    So on failure, just fail and return as we should expect
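
The difference between the two code paths can be illustrated with a toy model. This is a sketch under stated assumptions, not libvirt's implementation: `ToyStorageDriver`, `BuildError`, and the method names are hypothetical stand-ins for storagePoolCreateXML, the pre-fix storagePoolCreate, and the post-fix storagePoolCreate, and the pool list is reduced to a plain dictionary.

```python
class BuildError(Exception):
    """Stands in for a failed pool build (e.g. mkfs exiting non-zero)."""


class ToyStorageDriver:
    """Illustrative model only; real libvirt logic is far more involved."""

    def __init__(self):
        self.pools = {}  # name -> {"persistent": bool, "active": bool}

    def define(self, name):
        # Models `virsh pool-define`: a persistent, inactive pool.
        self.pools[name] = {"persistent": True, "active": False}

    def _build(self, name, build_ok):
        if not build_ok:
            raise BuildError(f"Failed to make filesystem for pool {name}")

    def create_xml(self, name, build_ok=True):
        """Transient pool: it did not exist before, so drop it on failure."""
        self.pools[name] = {"persistent": False, "active": False}
        try:
            self._build(name, build_ok)
        except BuildError:
            del self.pools[name]
            raise
        self.pools[name]["active"] = True

    def create_buggy(self, name, build_ok=True):
        """Pre-fix behaviour: cleanup copied from the transient path."""
        try:
            self._build(name, build_ok)
        except BuildError:
            del self.pools[name]  # wrong: the pool was previously defined
            raise
        self.pools[name]["active"] = True

    def create_fixed(self, name, build_ok=True):
        """Post-fix behaviour: on build failure, just fail and return."""
        self._build(name, build_ok)
        self.pools[name]["active"] = True
```

With this model, a failed `create_buggy` on a defined pool leaves the dictionary empty, while a failed `create_fixed` leaves the pool present and inactive, matching the before and after listings in this report.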

Comment 7 yisun 2016-08-11 08:04:00 UTC
Verified on libvirt-2.0.0-5.el7.x86_64
PASSED


# lsscsi
...
[19:0:0:0]   disk    SanDisk  Cruzer Blade     1.26  /dev/sdh 

# virsh pool-build fs --overwrite
Pool fs built


# parted /dev/sdh p
Model: SanDisk Cruzer Blade (scsi)
Disk /dev/sdh: 8004MB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags: 

Number  Start  End     Size    File system  Flags
 1      0.00B  8004MB  8004MB  xfs


#  virsh pool-list --all --type fs
 Name                 State      Autostart 
-------------------------------------------
 fs                   inactive   no        

# virsh pool-edit fs
...
<format type='ext4'/>
...
Pool fs XML configuration edited.


# virsh pool-start fs --no-overwrite
error: Failed to start pool fs
error: Failed to make filesystem of type 'ext4' on device '/dev/sdh': Invalid argument


# virsh pool-list --type fs --all
 Name                 State      Autostart 
-------------------------------------------
 fs                   inactive   no        



# virsh pool-dumpxml fs
<pool type='fs'>
  <name>fs</name>
  <uuid>0ce8fc1e-f195-404b-bede-23b3b21ffcb3</uuid>
  <capacity unit='bytes'>0</capacity>
  <allocation unit='bytes'>0</allocation>
  <available unit='bytes'>0</available>
  <source>
    <device path='/dev/sdh'/>
    <format type='ext4'/>
  </source>
  <target>
    <path>/var/lib/libvirt/images/fs</path>
  </target>
</pool>

Comment 9 errata-xmlrpc 2016-11-03 18:51:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html