Created attachment 1256941 [details]
engine, vdsm and supervdsm logs

Description of problem:
Trying to sparsify a thin provisioned disk on local file storage fails, with the following error in vdsm:

2017-02-23 14:35:07,044+0200 ERROR (tasks/6) [storage.guarded] Error acquiring lock <VolumeLease ns=04_lease_b604d887-d9db-42ab-9f5c-abaaaf645ad3, name=32067bd7-5659-4e0d-a2c8-5eeeedebfe21, mode=exclusive at 0x37282d0> (guarded:96)
2017-02-23 14:35:07,044+0200 ERROR (tasks/6) [root] Job u'2f49f97d-22d5-4d40-904d-43b2912fe72f' failed (jobs:217)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 154, in run
    self._run()
  File "/usr/share/vdsm/storage/sdm/api/sparsify_volume.py", line 53, in _run
    with guarded.context(self._vol_info.locks):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 102, in __enter__
    six.reraise(*exc)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 93, in __enter__
    lock.acquire()
  File "/usr/share/vdsm/storage/volume.py", line 1392, in acquire
    dom.acquireVolumeLease(self._host_id, self._img_id, self._vol_id)
  File "/usr/share/vdsm/storage/sd.py", line 471, in acquireVolumeLease
    self._domainLock.acquire(hostId, lease)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 490, in acquire
    raise MultipleLeasesNotSupported("acquire", lease)
MultipleLeasesNotSupported: Mulitple leases not supported, cannot acquire Lease(name=None, path=None, offset=None)

I am not 100% sure that it should have worked in the first place, but as far as I understand it should, and it at least shouldn't fail like this.

Version-Release number of selected component (if applicable):
rhevm-4.1.1.2-0.1.el7.noarch

Host:
vdsm-4.19.6-1.el7ev.x86_64
libvirt-client-2.0.0-10.el7_3.5.x86_64
libvirt-daemon-2.0.0-10.el7_3.5.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64
libvirt-daemon-driver-qemu-2.0.0-10.el7_3.5.x86_64
qemu-kvm-tools-rhev-2.6.0-28.el7_3.6.x86_64
ipxe-roms-qemu-20160127-5.git6366fa7a.el7.noarch
qemu-kvm-common-rhev-2.6.0-28.el7_3.6.x86_64
qemu-img-rhev-2.6.0-28.el7_3.6.x86_64

How reproducible:
So far, always.

Steps to Reproduce:
1. Add a host to the engine.
2. Create a folder on the host and run chown -R 36:36 and chmod -R 0755 on it.
3. Put the host in maintenance and choose 'Configure Local Storage' under the Management drop-down.
4. Specify the path to the folder you created.
5. A new DC and cluster are created for that host, with the local folder as the storage domain.
6. Import an image as a template. In my case I imported an image (one I use in other clusters) from our glance instance; I assume this is the same for images from other sources.
7. Create a VM from the template with a thin provisioned qcow2 disk.
8. Start the VM (which runs rhel-7.3), create a 300MB file in it, and then delete the file.
9. Stop the VM.
10. Under the VM's disks sub-tab, choose the Sparsify action.

Actual results:
The Sparsify action fails; it doesn't seem to actually run the command.

Expected results:
The Sparsify action works.

Additional info:
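For reference, the folder preparation in step 2 was along these lines (the path is only an example; 36:36 is the vdsm:kvm user and group on a RHV host):

  # mkdir -p /data/local-storage
  # chown -R 36:36 /data/local-storage
  # chmod -R 0755 /data/local-storage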
Tal, this is a vdsm bug - the new storage jobs are not compatible with a local DC. This is probably an issue only for sparsify, since we probably don't move or copy disks in a local DC.
(In reply to sefi litmanovich from comment #0)
> MultipleLeasesNotSupported: Mulitple leases not supported, cannot
> acquire Lease(name=None, path=None, offset=None)
>
> I am not 100% sure that it should have worked in the first place, but as
> far as I understand it should, and it at least shouldn't fail like this.

The issue is that we don't use sanlock on a local DC, and the special local lock used for the localfs storage domain does not support multiple leases. Storage jobs should either work without leases on a localfs storage domain, or we should add support for multiple leases.

The purpose of the leases in regular storage jobs is to protect volumes from concurrent access by multiple hosts, and to allow failover when a host becomes non-responsive. With a localfs storage domain only one host can access the storage, so vdsm's internal locks are enough, and we cannot fail over to other hosts anyway, since only that host can see the storage.

This needs a design. For now we cannot support sparsify on a local DC, so the action should be disabled in the engine until vdsm can support it.
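As a rough illustration of the limitation, here is a simplified sketch; this is NOT the real vdsm code (the actual logic lives in vdsm/storage/clusterlock.py, as the traceback shows), and the names are only illustrative. The local lock holds just the single domain lease, so the per-volume lease a storage job asks for cannot be acquired:

  class MultipleLeasesNotSupported(Exception):
      def __init__(self, action, lease):
          Exception.__init__(
              self, "Multiple leases not supported, cannot %s %s"
              % (action, lease))


  class LocalLock(object):
      """Host-local lock used for localfs domains instead of sanlock."""

      def __init__(self, domain_lease):
          self._lease = domain_lease  # the one and only supported lease
          self._held = False

      def acquire(self, host_id, lease):
          # A per-volume lease has no slot in this lock, so fail fast.
          if lease != self._lease:
              raise MultipleLeasesNotSupported("acquire", lease)
          self._held = True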
Really? So you're saying the local storage design is way simpler and requires no external locking, yet this feature won't work? What additional "design" is needed? Just run virt-sparsify when triggered by a user? Sorry, but I'm really disappointed by the obvious fact that local storage is a second class citizen in oVirt.
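For context, the operation being requested boils down to a single libguestfs command, along these lines (the disk path is only an example, and vdsm's actual invocation may differ):

  $ virt-sparsify --in-place /data/local-storage/<sd-uuid>/images/<img-uuid>/<vol-uuid>

virt-sparsify --in-place discards unused blocks inside the image directly; the older copying mode (virt-sparsify indisk outdisk) writes a sparsified copy instead.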
Unfortunately yes. We have some limitations with sparsify, and local storage is one of them; we should design and work on it. For now I'll open another bug on blocking sparsify on local storage, so the action will fail gracefully.
This bug will be used for letting sparsify fail gracefully on local storage. Michal, I think we should have an RFE for making sparsify work on local storage as well?
The complexity of locking at the host level was strongly pushed by the storage team. I'd be more than happy to remove that; we shouldn't have done it in the first place. Or are you suggesting an exception for local storage in the code?
(In reply to Tal Nisan from comment #5)
> This bug will be used for letting sparsify fail gracefully on local
> storage. Michal, I think we should have an RFE for making sparsify work
> on local storage as well?

The fix suggested in this BZ was reverted, and a "proper" fix to allow sparsify, and other HSM jobs, to run on local storage was merged (see bug 1432081 for details).

I'm returning the bug to MODIFIED, as it should only be verified with the next build, which contains the fix for bug 1432081.
(In reply to Allon Mureinik from comment #9)
> (In reply to Tal Nisan from comment #5)
> > This bug will be used for letting sparsify fail gracefully on local
> > storage. Michal, I think we should have an RFE for making sparsify work
> > on local storage as well?
>
> The fix suggested in this BZ was reverted, and a "proper" fix to allow
> sparsify, and other HSM jobs, to run on local storage was merged (see bug
> 1432081 for details).
>
> I'm returning the bug to MODIFIED, as it should only be verified with the
> next build, which contains the fix for bug 1432081.

IIUC this means sparsify will work once BZ 1432081 is merged?

If this is correct, the title of this BZ should be changed, shouldn't it?
(In reply to Sven Kieske from comment #10)
> (In reply to Allon Mureinik from comment #9)
> > (In reply to Tal Nisan from comment #5)
> > > This bug will be used for letting sparsify fail gracefully on local
> > > storage. Michal, I think we should have an RFE for making sparsify
> > > work on local storage as well?
> >
> > The fix suggested in this BZ was reverted, and a "proper" fix to allow
> > sparsify, and other HSM jobs, to run on local storage was merged (see
> > bug 1432081 for details).
> >
> > I'm returning the bug to MODIFIED, as it should only be verified with
> > the next build, which contains the fix for bug 1432081.
>
> IIUC this means sparsify will work once BZ 1432081 is merged?
>
> If this is correct, the title of this BZ should be changed, shouldn't it?

Yes - just like any other file-based storage. I've edited the title accordingly.
Verified with rhv-4.1.1.6-0.1.el7