Bug 1426265 - Sparsify should work on local storage
Summary: Sparsify should work on local storage
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.1.1.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium with 1 vote
Target Milestone: ovirt-4.1.1
Target Release: 4.1.1.6
Assignee: Tal Nisan
QA Contact: sefi litmanovich
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-02-23 15:01 UTC by sefi litmanovich
Modified: 2017-04-21 09:35 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned to: 1432081
Environment:
Last Closed: 2017-04-21 09:35:24 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: exception+


Attachments
engine, vdsm and supervdsm logs (414.09 KB, application/x-gzip)
2017-02-23 15:01 UTC, sefi litmanovich
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 74057 0 master MERGED core: Prevent running sparsify on disks on local storage 2017-03-14 17:10:43 UTC
oVirt gerrit 74074 0 ovirt-engine-4.1 MERGED core: Prevent running sparsify on disks on local storage 2017-03-14 17:11:24 UTC
oVirt gerrit 74075 0 ovirt-engine-4.1.1.z MERGED core: Prevent running sparsify on disks on local storage 2017-03-14 17:12:01 UTC

Description sefi litmanovich 2017-02-23 15:01:01 UTC
Created attachment 1256941 [details]
engine, vdsm and supervdsm logs

Description of problem:
Trying to sparsify a thin provisioned disk on a local file storage fails, with the following error in vdsm:

    2017-02-23 14:35:07,044+0200 ERROR (tasks/6) [storage.guarded] Error acquiring lock <VolumeLease ns=04_lease_b604d887-d9db-42ab-9f5c-abaaaf645ad3, name=32067bd7-5659-4e0d-a2c8-5eeeedebfe21, mode=exclusive at 0x37282d0> (guarded:96)
    2017-02-23 14:35:07,044+0200 ERROR (tasks/6) [root] Job u'2f49f97d-22d5-4d40-904d-43b2912fe72f' failed (jobs:217)
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 154, in run
        self._run()
      File "/usr/share/vdsm/storage/sdm/api/sparsify_volume.py", line 53, in _run
        with guarded.context(self._vol_info.locks):
      File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 102, in __enter__
        six.reraise(*exc)
      File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 93, in __enter__
        lock.acquire()
      File "/usr/share/vdsm/storage/volume.py", line 1392, in acquire
        dom.acquireVolumeLease(self._host_id, self._img_id, self._vol_id)
      File "/usr/share/vdsm/storage/sd.py", line 471, in acquireVolumeLease
        self._domainLock.acquire(hostId, lease)
      File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 490, in acquire
        raise MultipleLeasesNotSupported("acquire", lease)
    MultipleLeasesNotSupported: Mulitple leases not supported, cannot acquire Lease(name=None, path=None, offset=None)

I am not 100% sure that it should have worked in the first place, but as far as I understand it should, and at the very least it shouldn't fail like this.
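
For illustration, a minimal self-contained Python sketch of the failing pattern, using
stand-in classes rather than the real vdsm code: the local cluster lock only knows about
its single domain lease, so asking it to acquire an additional per-volume lease, as the
sparsify job does, raises MultipleLeasesNotSupported.

    class MultipleLeasesNotSupported(Exception):
        pass

    DOMAIN_LEASE = object()  # stand-in for the one lease a local lock supports

    class LocalLock(object):
        """Stand-in for the localfs cluster lock: single lease only."""
        def acquire(self, host_id, lease):
            if lease is not DOMAIN_LEASE:
                raise MultipleLeasesNotSupported(
                    "cannot acquire extra lease %r" % (lease,))
            print("host %s acquired the domain lease" % host_id)

    class VolumeLease(object):
        """Stand-in for the per-volume lease a storage job tries to take."""
        def __init__(self, domain_lock, host_id, vol_id):
            self._domain_lock = domain_lock
            self._host_id = host_id
            self._vol_id = vol_id

        def acquire(self):
            # On a sanlock-backed domain this would take a per-volume lease;
            # the local lock rejects anything besides its single domain lease.
            self._domain_lock.acquire(self._host_id, lease=self._vol_id)

    lock = LocalLock()
    lease = VolumeLease(lock, host_id=1,
                        vol_id="32067bd7-5659-4e0d-a2c8-5eeeedebfe21")
    try:
        lease.acquire()  # roughly where the job above blows up
    except MultipleLeasesNotSupported as exc:
        print("sparsify job fails: %s" % exc)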



Version-Release number of selected component (if applicable):
rhevm-4.1.1.2-0.1.el7.noarch

Host:
vdsm-4.19.6-1.el7ev.x86_64
libvirt-client-2.0.0-10.el7_3.5.x86_64
libvirt-daemon-2.0.0-10.el7_3.5.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64
libvirt-daemon-driver-qemu-2.0.0-10.el7_3.5.x86_64
qemu-kvm-tools-rhev-2.6.0-28.el7_3.6.x86_64
ipxe-roms-qemu-20160127-5.git6366fa7a.el7.noarch
qemu-kvm-common-rhev-2.6.0-28.el7_3.6.x86_64
qemu-img-rhev-2.6.0-28.el7_3.6.x86_64

How reproducible:
so far always

Steps to Reproduce:
1. Add a host to the engine.
2. Create a folder on the host and set chown -R 36:36 and chmod -R 0755 on it (see the Python sketch after this list).
3. Put the host in maintenance and choose 'Configure Local Storage' under the Management drop-down.
4. Specify the path to the folder you created.
5. A new DC and cluster are created for that host, with the local folder as its storage domain.
6. In my case I imported an image (that I use in other clusters) from our Glance instance as a template; I assume this is the same for images from other sources.
7. Create a VM from the template with a thin provisioned qcow2 disk.
8. Start the VM (which is running RHEL 7.3), create a 300 MB file in the guest and then delete it.
9. Stop the VM.
10. Under the VM's Disks sub-tab, choose the Sparsify action.
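
For illustration, a rough Python equivalent of step 2 (the path below is just an
example, and uid/gid 36:36 is the usual vdsm:kvm ownership):

    import os

    def prepare_local_storage(path, uid=36, gid=36, mode=0o755):
        """Roughly 'mkdir -p' plus 'chown -R 36:36' and 'chmod -R 0755'."""
        if not os.path.isdir(path):
            os.makedirs(path)
        for root, dirs, files in os.walk(path):
            for name in dirs + files:
                full = os.path.join(root, name)
                os.chown(full, uid, gid)
                os.chmod(full, mode)
        os.chown(path, uid, gid)
        os.chmod(path, mode)

    prepare_local_storage("/data/ovirt-local-storage")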

Actual results:
Sparsify action fails; it doesn't seem to actually run the command.

Expected results:
Sparsify action works fine.

Additional info:

Comment 1 Nir Soffer 2017-02-26 14:51:38 UTC
Tal, this is a vdsm bug - the new storage jobs are not compatible with local DC.

This is probably an issue only for sparsify, since we probably do move or copy
disks in local DC.

Comment 2 Nir Soffer 2017-02-26 15:00:57 UTC
(In reply to sefi litmanovich from comment #0)
>     MultipleLeasesNotSupported: Mulitple leases not supported, cannot
> acquire Lease(name=None, path=None, offset=None)
> 
> I am not 100% that it should have worked in the first place, but as far as I
> understand it should, and it at least shouldn't fail like this.

The issue is that we don't use sanlock on a local DC, and the special local lock for
the localfs storage domain does not support multiple leases.

Storage jobs should either work without leases on a localfs storage domain, or we
should add support for multiple leases.

The purpose of leases in regular storage jobs is to protect volumes from concurrent
access by multiple hosts, and to allow failover when a host becomes non-responsive.
With a localfs storage domain only one host can access the storage, so vdsm's
internal locks are enough, and we cannot fail over to other hosts, since only one
host can see this storage.

This needs a design. For now we cannot support sparsify on a local DC, so it should
be disabled in the engine until vdsm can support it.
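
For illustration, a rough sketch of that idea with assumed names (not the fix that was
eventually merged): a storage job could pick its volume lock based on the domain type,
taking a sanlock volume lease on shared storage and only a process-internal lock on a
local domain, where no other host can touch the data.

    import threading

    _local_volume_locks = {}   # vol_id -> threading.Lock, valid in this process only

    class InProcessVolumeLock(object):
        """Enough for a local DC: no other host can access the volume."""
        def __init__(self, vol_id):
            self._lock = _local_volume_locks.setdefault(vol_id, threading.Lock())

        def acquire(self):
            self._lock.acquire()

        def release(self):
            self._lock.release()

    def volume_locks(domain_is_local, vol_id, make_sanlock_lease):
        # Shared storage: a sanlock lease excludes other hosts and expires if
        # this host dies, allowing failover.
        # Local storage: one host owns the data, an internal lock suffices, and
        # there is no other host to fail over to anyway.
        if domain_is_local:
            return [InProcessVolumeLock(vol_id)]
        return [make_sanlock_lease(vol_id)]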

Comment 3 Sven Kieske 2017-02-27 10:18:14 UTC
Really?

So you're saying the local storage design is much simpler and requires no external locking, yet this feature won't work.

What additional "design" is needed?

Just run virt-sparsify when triggered by a user?

Sorry, but I'm really disappointed by the obvious fact that local storage is a second-class citizen in oVirt.

Comment 4 Tal Nisan 2017-03-02 07:24:59 UTC
Unfortunately yes, we have some limitations to sparsify and local storage is one of them. We should design and work on it; for now I'll open another bug about blocking sparsify on local storage so that the action fails gracefully.

Comment 5 Tal Nisan 2017-03-14 12:46:16 UTC
This bug will be used for letting sparsify fail gracefully on local storage. Michal, I think we should have an RFE for making sparsify work on local storage as well?

Comment 6 Michal Skrivanek 2017-03-14 12:50:57 UTC
The complexity of locking at the host level was strongly pushed by the storage team. I'd be more than happy to remove that; we shouldn't have done it in the first place. Or are you suggesting an exception for local storage in the code?

Comment 9 Allon Mureinik 2017-03-21 10:05:27 UTC
(In reply to Tal Nisan from comment #5)
> This bug will be used for letting sparsify fail gracefully on local storage,
> Michal I think we should have an RFE for making sparsify work on local
> storage as well?

The fix suggested in this BZ was reverted, and a "proper" fix to allow sparsify, and other HSM jobs, to run on local storage was merged (see bug 1432081 for details).

I'm returning the bug to MODIFIED, as it should only be verified when the next build, containing the fix for bug 1432081, is merged.

Comment 10 Sven Kieske 2017-03-23 12:52:04 UTC
(In reply to Allon Mureinik from comment #9)
> (In reply to Tal Nisan from comment #5)
> > This bug will be used for letting sparsify fail gracefully on local storage,
> > Michal I think we should have an RFE for making sparsify work on local
> > storage as well?
> 
> The fix suggested in this BZ was reverted, and a "proper" fix to allow
> sparsify, and other HSM jobs, to run on local storage was merged (see bug
> 1432081 for details).
> 
> I'm returning the bug to MODIFIED, as it should only be verified when the
> next build, containing the fix for bug 1432081, is merged.

IIUC this means sparsify will work with BZ 1432081 merged?

If this is correct, the title of this BZ should be changed, shouldn't it?

Comment 11 Allon Mureinik 2017-03-28 11:52:52 UTC
(In reply to Sven Kieske from comment #10)
> (In reply to Allon Mureinik from comment #9)
> > (In reply to Tal Nisan from comment #5)
> > > This bug will be used for letting sparsify fail gracefully on local storage,
> > > Michal I think we should have an RFE for making sparsify work on local
> > > storage as well?
> > 
> > The fix suggested in this BZ was reverted, and a "proper" fix to allow
> > sparsify, and other HSM jobs, to run on local storage was merged (see bug
> > 1432081 for details).
> > 
> > I'm returning the bug to MODIFIED, as it should only be verified when the
> > next build, containing the fix for bug 1432081, is merged.
> 
> IIUC this means sparsify will work with BZ 1432081 merged?
> 
> If this is correct, the title of this BZ should be changed, shouldn't it?
Yes - just like any other file-based storage.

I've edited the title accordingly.

Comment 12 sefi litmanovich 2017-03-28 15:08:32 UTC
Verified with rhv-4.1.1.6-0.1.el7

