Bug 1223177

Summary: Metadata preallocation should not be supported for rbd based volumes
Product: Red Hat Enterprise Linux 7
Reporter: Yang Yang <yanyang>
Component: libvirt
Assignee: Erik Skultety <eskultet>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.2
CC: dyuan, eskultet, mzhan, rbalakri, shyu, xuzhang
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-1.2.17-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-19 06:31:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  libvirt log (flags: none)

Description Yang Yang 2015-05-20 03:03:11 UTC
Description of problem:
The following errors are reported when creating an rbd-based volume with the --prealloc-metadata option:
error: Failed to create vol yy3.img
error: failed to remove volume 'libvirt-pool/yy3.img': No such file or directory

Version-Release number of selected component (if applicable):
libvirt-1.2.15-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Define and start an rbd pool
# cat rbd-pool.xml
 <pool type='rbd'>
   <name>rbd</name>
   <source>
     <host name='10.66.5.219' port='6789'/>
     <host name='osd2.redhat.com' port='6789'/>
     <host name='osd3.redhat.com' port='6789'/>
     <name>libvirt-pool</name>
     <auth type='ceph' username='libvirt'>
       <secret usage='client.libvirt secret'/>
     </auth>
   </source>
 </pool>

 # virsh pool-define rbd-pool.xml
 # virsh pool-start rbd

2. Create an rbd vol with the --prealloc-metadata option

 # virsh vol-create-as rbd yy3.img 1G --format raw --prealloc-metadata
 error: Failed to create vol yy3.img
 error: failed to remove volume 'libvirt-pool/yy3.img': No such file or directory

# virsh vol-list rbd
  Name                 Path                                    
 ------------------------------------------------------------------------------
  yy1.img              libvirt-pool/yy1.img                    
  yy2.img              libvirt-pool/yy2.img                    
  yy3.img              libvirt-pool/yy3.img

3. After pool-refresh, yy3.img disappears

# virsh pool-refresh rbd
Pool rbd refreshed

[root@rhel7_test ~]# virsh vol-list rbd
 Name                 Path                                    
------------------------------------------------------------------------------
 rbd1.img             libvirt-pool/rbd1.img                   
 yy1.img              libvirt-pool/yy1.img   

Actual results:


Expected results:
In step 2, libvirt should refuse to preallocate metadata and report an error like:

unsupported configuration: metadata preallocation is not supported for raw volumes


Additional info:

Comment 2 Yang Yang 2015-05-29 07:13:48 UTC
I hit the same issue when creating a vol with a backing file in an rbd pool.

Steps:

1. # cat BZ958510-RBD 
<volume type='network'>
  <name>demo</name>
  <source>
  </source>
  <backingStore>
    <path>/var/lib/libvirt/images/vm1.raw</path>
  </backingStore>
</volume>

# virsh vol-create rbd BZ958510-RBD 
error: Failed to create vol from BZ958510-RBD
error: failed to remove volume 'libvirt-pool/demo': No such file or directory

2. # virsh vol-list rbd | grep demo
 demo                 libvirt-pool/demo 

3. # virsh vol-delete demo rbd
error: Failed to delete vol demo
error: failed to remove volume 'libvirt-pool/demo': No such file or directory

Comment 3 Erik Skultety 2015-06-02 13:21:52 UTC
Fixed upstream:

commit 4749d82a8bb92b908fe7f30038d8b1ea3390384d
Author: Erik Skultety <eskultet>
Date:   Thu May 28 17:00:01 2015 +0200

    storage: Don't update volume objs list before we successfully create one
    
    We do update pool volume object list before we actually create any
    volume. If buildVol fails, we then try to delete the volume in the
    storage as well as remove it from our structures. The problem is, that
    any backend that supports both buildVol and deleteVol would fail in this
    case which is completely unnecessary. This patch causes the update to
    take place after we know a volume has been created successfully, thus no
    removal in case of a buildVol failure is necessary.

v1.2.16-12-g4749d82
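
The ordering described in the commit message above can be illustrated with a small toy model. This is only a sketch, not libvirt code: `Pool`, `BuildError`, `create_volume` and the build callbacks are hypothetical names invented for the illustration. It shows that when the volume list is updated only after the build step succeeds, a failed build leaves nothing to remove from the list.

```python
class BuildError(Exception):
    """Raised by the hypothetical backend when building a volume fails."""


class Pool:
    """Toy stand-in for a storage pool; not libvirt's real data structure."""

    def __init__(self, build_vol):
        self.volumes = []            # names of volumes the pool knows about
        self._build_vol = build_vol  # backend callback that may raise BuildError

    def create_volume(self, name):
        # Build first. If this raises, the list was never touched,
        # so there is nothing to roll back in the pool's bookkeeping.
        self._build_vol(name)
        # Only a successfully built volume is added to the list.
        self.volumes.append(name)


def failing_build(name):
    # Stand-in for a backend that rejects the request.
    raise BuildError(f"cannot build {name}")


def working_build(name):
    # Stand-in for a backend that succeeds.
    return None


pool = Pool(failing_build)
try:
    pool.create_volume("yy3.img")
except BuildError:
    pass
print(pool.volumes)     # the failed volume is not listed

ok_pool = Pool(working_build)
ok_pool.create_volume("yy1.img")
print(ok_pool.volumes)  # the successful volume is listed
```

The model deliberately ignores what happens to anything the backend already wrote to storage before failing; it only covers the in-memory list.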

Comment 5 Yang Yang 2015-07-06 09:07:59 UTC
Hi Erik,

The patch introduces a regression. When I tried to create a huge volume, volume creation failed as expected, but the underlying file was not removed. I think we should delete the underlying file when buildret fails.


Repro steps
1. Make sure haha.img does not exist
# virsh vol-list default| grep haha.img
# ll /var/lib/libvirt/images/haha.img
ls: cannot access /var/lib/libvirt/images/haha.img: No such file or directory

2. Create haha.img in the default pool with a huge size
# virsh vol-create-as default haha.img 100000G
error: Failed to create vol haha.img
error: cannot allocate 107374182400000 bytes in file '/var/lib/libvirt/images/haha.img': No space left on device

3. Check whether the haha.img file exists
# virsh vol-list default| grep haha.img
# ll /var/lib/libvirt/images/haha.img
-rw-------. 1 root root 107374182400000 Jul  6 16:35 /var/lib/libvirt/images/haha.img

Regards
Yang
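
The cleanup requested in this comment (remove the underlying file when the build step fails) could look roughly like the sketch below. It is an illustration only, assuming a hypothetical `create_volume_file` helper and a fake allocator `no_space` that always fails with ENOSPC; it is not how libvirt implements it. The size uses the same arithmetic as the error message above, where 100000G corresponds to 107374182400000 bytes.

```python
import errno
import os
import tempfile


def create_volume_file(path, size, allocate):
    """Create a file and allocate it; remove the file if allocation fails."""
    with open(path, "wb"):
        pass  # create the (empty) underlying file
    try:
        allocate(path, size)
    except OSError:
        # Allocation failed: do not leave a stale file behind.
        os.unlink(path)
        raise


def no_space(path, size):
    # Hypothetical allocator that always fails, like a full filesystem.
    raise OSError(errno.ENOSPC, "No space left on device")


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "haha.img")
size = 100000 * 1024**3  # 100000G expressed in bytes

try:
    create_volume_file(path, size, no_space)
except OSError as exc:
    print("create failed:", exc.strerror)

leftover = os.path.exists(path)
print("file left behind:", leftover)
os.rmdir(workdir)  # succeeds only because the file was cleaned up
```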

Comment 6 Erik Skultety 2015-07-09 08:07:55 UTC
I had a look at this and you're right: the patch above caused a regression in the cleanup phase when volume create/build fails. I think you should file a new bug, as the regression occurs in a different storage backend.

Comment 7 Erik Skultety 2015-07-09 12:54:38 UTC
As I stated in the commit message in BZ 1241454 comment #2, it was necessary to completely revert commit 4749d82a (comment #3 above); however,

commit c8be606baec0a4fb477d1385454cf10ce15b061c
Author: Erik Skultety <eskultet>
Date:   Thu May 28 17:14:47 2015 +0200

    storage: RBD: do not return error when deleting non-existent volume
    
    RBD API returns negative value of errno, in that case we can silently
    ignore if RBD tries to delete a non-existent volume, just like FS
    backend does.


v1.2.16-13-gc8be606 also fixes this RBD backend-related issue.
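
The behaviour described in the commit message above (a negative errno return, with "no such volume" silently ignored on delete) can be sketched as follows. This is a toy model under stated assumptions: `delete_volume`, `remove_missing` and `remove_busy` are hypothetical names, and the callbacks only imitate an API that returns 0 or a negated errno value.

```python
import errno


def delete_volume(remove, name):
    """Return 0 on success or if the volume is already gone, else raise."""
    ret = remove(name)  # hypothetical callback returning 0 or -errno
    if ret < 0 and ret != -errno.ENOENT:
        # Any failure other than "does not exist" is still an error.
        raise OSError(-ret, f"failed to remove volume '{name}'")
    return 0


def remove_missing(name):
    # Imitates removing a volume that does not exist.
    return -errno.ENOENT


def remove_busy(name):
    # Imitates a removal that fails for a different reason.
    return -errno.EBUSY


result_missing = delete_volume(remove_missing, "libvirt-pool/yy3.img")
print(result_missing)  # deleting a non-existent volume is not an error

try:
    delete_volume(remove_busy, "libvirt-pool/yy1.img")
    busy_raised = False
except OSError as exc:
    busy_raised = exc.errno == errno.EBUSY
print(busy_raised)     # other failures are still reported
```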

Comment 8 Yang Yang 2015-07-13 10:00:42 UTC
Erik,

An unknown error is reported when rbd volume create/build fails:

# virsh vol-create-as rbd yy3.img 100M --format raw --prealloc-metadata
error: Failed to create vol yy3.img
error: An error occurred, but the cause is unknown

2015-07-13 09:56:32.057+0000: 5475: debug : virStorageBackendRBDBuildVol:508 : Creating RBD image libvirt-pool/yy3.img with size 104857600
2015-07-13 09:56:32.057+0000: 5475: error : virStorageBackendRBDBuildVol:510 : unsupported flags (0x1) in function virStorageBackendRBDBuildVol
2015-07-13 09:56:32.057+0000: 5475: debug : virStorageBackendRBDDeleteVol:426 : Removing RBD image libvirt-pool/yy3.img
2015-07-13 09:56:32.057+0000: 5475: debug : virStorageBackendRBDOpenRADOSConn:71 : Using cephx authorization, username: libvirt
2015-07-13 09:56:32.057+0000: 5475: debug : virStorageBackendRBDOpenRADOSConn:91 : Looking up secret by usage: client.libvirt secret
2015-07-13 09:56:32.057+0000: 5475: debug : virSecretLookupByUsage:294 : conn=0x7fd7e40012e0, usageType=2 usageID=client.libvirt secret
2015-07-13 09:56:32.057+0000: 5475: info : virObjectRef:296 : OBJECT_REF: obj=0x7fd80adbebb0

Comment 9 Erik Skultety 2015-07-13 12:33:13 UTC
I tested this on my local Fedora 20 machine with a downstream build, as well as on a 7.1 server with fresh libvirt 1.2.17-2 packages. The use case above still reported the correct error "unsupported flags (0x1) in function virStorageBackendRBDBuildVol".

Comment 10 Yang Yang 2015-07-14 02:05:46 UTC
It's weird. I can always reproduce it with libvirt-1.2.17-2.el7.x86_64. Attaching the complete log.

Comment 11 Yang Yang 2015-07-14 02:06:51 UTC
Created attachment 1051589
libvirt log

Comment 12 Erik Skultety 2015-07-14 11:53:40 UTC
Well, this one was a little tricky to find. There's a problem in the code flow of our RBD backend: each and every operation on the pool needs to open a completely new connection. And there it is. You use authentication via a secret; I don't (I configured only a minimalistic ceph server). Opening an authenticated connection requires a secret lookup, and new operations tend to reset the last error. So the daemon logs the correct error, but that error is reset later, just before the result is packed and sent back to the client via RPC. You should file a new bug for this.
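
The effect described in this comment can be reproduced in a toy model with a single "last error" slot. Everything here is hypothetical (the `Daemon` class, its methods and the `preserve` switch are invented for the illustration and are not libvirt APIs); the sketch only shows why a later successful operation can wipe the error the client should have seen, and how saving and restoring the error around cleanup avoids that.

```python
UNKNOWN = "An error occurred, but the cause is unknown"


class Daemon:
    """Toy model of a single 'last error' slot; all names are hypothetical."""

    def __init__(self):
        self.last_error = None

    def secret_lookup(self):
        # A new operation that succeeds resets the last error.
        self.last_error = None

    def build_vol(self):
        # The build step fails and records the real cause.
        self.last_error = "unsupported flags (0x1)"
        return -1

    def delete_vol(self, authenticated):
        # Cleanup opens a new connection; with authentication enabled
        # that involves a secret lookup.
        if authenticated:
            self.secret_lookup()
        return 0

    def create(self, authenticated, preserve):
        if self.build_vol() < 0:
            saved = self.last_error
            self.delete_vol(authenticated)
            if preserve:
                # Put the original error back after cleanup.
                self.last_error = saved
        # What the client would see once the result is sent back.
        return self.last_error or UNKNOWN


with_auth = Daemon().create(authenticated=True, preserve=False)
without_auth = Daemon().create(authenticated=False, preserve=False)
with_auth_preserved = Daemon().create(authenticated=True, preserve=True)

print(with_auth)            # the real cause was lost
print(without_auth)         # the real cause survives
print(with_auth_preserved)  # the real cause survives when saved and restored
```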

Comment 13 Yang Yang 2015-07-16 08:42:11 UTC
Verified with libvirt-1.2.17-2.el7.x86_64
Note:
Verified only with authentication disabled. With authentication enabled, see bug 1243202.

Steps
1. Define and start an rbd pool
# cat rbd-pool.xml
 <pool type='rbd'>
   <name>rbd</name>
   <source>
     <host name='10.66.5.219' port='6789'/>
     <host name='osd2.redhat.com' port='6789'/>
     <host name='osd3.redhat.com' port='6789'/>
     <name>libvirt-pool</name>
   </source>
 </pool>

 # virsh pool-define rbd-pool.xml
 # virsh pool-start rbd

2. # virsh vol-create-as rbd yy3.img 100M --format raw --prealloc-metadata
error: Failed to create vol yy3.img
error: unsupported flags (0x1) in function virStorageBackendRBDBuildVol

3. # virsh vol-create-as rbd yy3.img 100M --format raw --backing-vol /var/lib/libvirt/images/vm1.raw 
Vol yy3.img created

Although the vol is created without error when a backing vol is specified, the backing vol does not actually take effect. It will disappear after refreshing the pool. I think that's an acceptable result.

Comment 15 errata-xmlrpc 2015-11-19 06:31:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html