Bug 1408825 - [VDSM] Add the ability to create and remove lease while vm is up
Summary: [VDSM] Add the ability to create and remove lease while vm is up
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: RFEs
Version: 4.19.20
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.1.3
Target Release: ---
Assignee: Nir Soffer
QA Contact: Lilach Zitnitski
URL:
Whiteboard:
Depends On: 1406765 1415488
Blocks: 1453163
 
Reported: 2016-12-27 13:18 UTC by Lilach Zitnitski
Modified: 2017-07-06 14:05 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
With this update, the ability to unplug a lease from and plug another lease into a running virtual machine has been added using new APIs. This provides the ability to move a virtual machine lease from one storage domain to another so that the original storage domain can be placed into maintenance.
Clone Of:
Environment:
Last Closed: 2017-07-06 14:05:16 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+
ratamir: testing_plan_complete-
ylavi: planning_ack+
rule-engine: devel_ack+
ratamir: testing_ack+




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 69335 0 master MERGED vm: Add or remove a lease when a VM is running 2021-02-15 09:20:33 UTC
oVirt gerrit 69350 0 master MERGED exception: Add ContextException 2021-02-15 09:20:33 UTC
oVirt gerrit 69351 0 master MERGED vm: Log device __repr__ instead of name 2021-02-15 09:20:34 UTC
oVirt gerrit 69352 0 master MERGED lease: Add lease.find_device helper 2021-02-15 09:20:34 UTC
oVirt gerrit 69353 0 master MERGED lease: Implement lease.Device.is_attached_to 2021-02-15 09:20:34 UTC
oVirt gerrit 69401 0 master MERGED exception: Add context to some virt errors 2021-02-15 09:20:34 UTC
oVirt gerrit 69743 0 master MERGED lease: Add lease.find_conf helper 2021-02-15 09:20:34 UTC
oVirt gerrit 71648 0 ovirt-4.1 MERGED exception: Add ContextException 2021-02-15 09:20:34 UTC
oVirt gerrit 71649 0 ovirt-4.1 MERGED exception: Add context to some virt errors 2021-02-15 09:20:35 UTC
oVirt gerrit 71650 0 ovirt-4.1 MERGED vm: Log device __repr__ instead of name 2021-02-15 09:20:35 UTC
oVirt gerrit 71651 0 ovirt-4.1 MERGED lease: Check for six.PY2 instead of six.PY3 2021-02-15 09:20:35 UTC
oVirt gerrit 71652 0 ovirt-4.1 MERGED lease: Add lease.find_device helper 2021-02-15 09:20:35 UTC
oVirt gerrit 71653 0 ovirt-4.1 MERGED lease: Implement lease.Device.is_attached_to 2021-02-15 09:20:36 UTC
oVirt gerrit 71654 0 ovirt-4.1 MERGED lease: Add lease.find_conf helper 2021-02-15 09:20:36 UTC
oVirt gerrit 71655 0 ovirt-4.1 MERGED vm: Add or remove a lease when a VM is running 2021-02-15 09:20:35 UTC

Description Lilach Zitnitski 2016-12-27 13:18:41 UTC
Description of problem:
Currently, if a user wants to create a new lease for a vm or remove an existing one, the vm has to be powered off.
This makes it hard to keep the vm highly available in scenarios such as Live Storage Migration.
Therefore, creating and removing a lease should be possible even while the vm is up; this will also make it possible to migrate the lease between storage domains while the vm is running.

Comment 1 Tal Nisan 2016-12-28 09:58:54 UTC
Pending some research; targeting this to 4.1 in the meanwhile to keep it on our radar.

Comment 2 Nir Soffer 2017-01-01 01:51:47 UTC
The attached patches are for the vdsm side.

Tal: I think we need a new bug for the engine side and maybe a tracker bug
linking to both bugs.

Comment 3 Nir Soffer 2017-01-01 20:39:12 UTC
How to test the vdsm side

Install vdsm-client:

    yum install vdsm-client 

The test uses two leases:
- the default lease, created using the vm dialog
- a temp lease, created manually using vdsm-client

Creating the temp lease (run this on the SPM):

# echo '{"lease": {"sd_id": "7df95b16-1bd3-4c23-bbbe-b21d403bdcd8", "lease_id": "952f9034-77da-40f4-9ba1-e3356c3f3e89"}}' | vdsm-client -f - Lease create
"75cc6395-6fb9-4143-9a95-988238b44130"

Checking the temp lease info (needed for preparing a json file for hotplug):

# echo '{"lease": {"sd_id": "7df95b16-1bd3-4c23-bbbe-b21d403bdcd8", "lease_id": "952f9034-77da-40f4-9ba1-e3356c3f3e89"}}' | vdsm-client -f - Lease info
{
    "path": "/dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/xleases",
    "lease_id": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
    "sd_id": "7df95b16-1bd3-4c23-bbbe-b21d403bdcd8",
    "offset": 3145728
}

Clear the SPM task:

# vdsm-client Task clear taskID=75cc6395-6fb9-4143-9a95-988238b44130

Default lease json file:

# cat /root/default-lease.json
{
    "vmID": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
    "lease": {
        "lease_id": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
        "sd_id": "40825394-bb03-4b66-8a82-e6ddbb789ec3",
        "type": "lease"
    }
}

Temp lease json file:

# cat /root/temp-lease.json
{
    "vmID": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
    "lease": {
        "sd_id": "7df95b16-1bd3-4c23-bbbe-b21d403bdcd8",
        "lease_id": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
        "type": "lease"
    }
}

Before plugging lease:

# sanlock client status
daemon 39a2e86d-08db-47ce-a973-ed2058103b7d.voodoo6.tl
p -1 helper
p -1 listener
p 7611 ha1
p -1 status
s 7df95b16-1bd3-4c23-bbbe-b21d403bdcd8:1:/dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/ids:0
s c86e8dab-444b-4c66-b004-f1768f12219b:1:/rhev/data-center/mnt/dumbo.tlv.redhat.com\:_export_voodoo_01/c86e8dab-444b-4c66-b004-f1768f12219b/dom_md/ids:0
s 40825394-bb03-4b66-8a82-e6ddbb789ec3:1:/dev/40825394-bb03-4b66-8a82-e6ddbb789ec3/ids:0
r 40825394-bb03-4b66-8a82-e6ddbb789ec3:952f9034-77da-40f4-9ba1-e3356c3f3e89:/dev/40825394-bb03-4b66-8a82-e6ddbb789ec3/xleases:4194304:27 p 7611

Hot plugging temp lease to vm:

# vdsm-client VM hotplugLease -f /root/temp-lease.json
[vm xml snipped]

After plugging temp lease, vm has 2 leases:

# sanlock client status
daemon 39a2e86d-08db-47ce-a973-ed2058103b7d.voodoo6.tl
p -1 helper
p -1 listener
p 7611 ha1
p -1 status
s 7df95b16-1bd3-4c23-bbbe-b21d403bdcd8:1:/dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/ids:0
s c86e8dab-444b-4c66-b004-f1768f12219b:1:/rhev/data-center/mnt/dumbo.tlv.redhat.com\:_export_voodoo_01/c86e8dab-444b-4c66-b004-f1768f12219b/dom_md/ids:0
s 40825394-bb03-4b66-8a82-e6ddbb789ec3:1:/dev/40825394-bb03-4b66-8a82-e6ddbb789ec3/ids:0
r 7df95b16-1bd3-4c23-bbbe-b21d403bdcd8:952f9034-77da-40f4-9ba1-e3356c3f3e89:/dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/xleases:3145728:12 p 7611
r 40825394-bb03-4b66-8a82-e6ddbb789ec3:952f9034-77da-40f4-9ba1-e3356c3f3e89:/dev/40825394-bb03-4b66-8a82-e6ddbb789ec3/xleases:4194304:27 p 7611

Unplugging default lease from vm:

# vdsm-client VM hotunplugLease -f /root/default-lease.json
[vm xml snipped]

After unplugging the default lease, the vm holds only the temp lease.

# sanlock client status
daemon 39a2e86d-08db-47ce-a973-ed2058103b7d.voodoo6.tl
p -1 helper
p -1 listener
p 7611 ha1
p -1 status
s 7df95b16-1bd3-4c23-bbbe-b21d403bdcd8:1:/dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/ids:0
s c86e8dab-444b-4c66-b004-f1768f12219b:1:/rhev/data-center/mnt/dumbo.tlv.redhat.com\:_export_voodoo_01/c86e8dab-444b-4c66-b004-f1768f12219b/dom_md/ids:0
s 40825394-bb03-4b66-8a82-e6ddbb789ec3:1:/dev/40825394-bb03-4b66-8a82-e6ddbb789ec3/ids:0
r 7df95b16-1bd3-4c23-bbbe-b21d403bdcd8:952f9034-77da-40f4-9ba1-e3356c3f3e89:/dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/xleases:3145728:12 p 7611

At this point domain 40825394-bb03-4b66-8a82-e6ddbb789ec3 can be put
into maintenance (assuming that the vm does not have a disk on this domain).
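
For convenience, here is a minimal shell sketch condensing the commands above into one place. It assumes the same VM and storage domain UUIDs used in this comment, that vdsm-client is installed, that the Lease create / Task clear steps run on the SPM host while the hotplug/hotunplug steps run on the host running the vm, and that the task id placeholder is filled in from the Lease create output:

VM_ID=952f9034-77da-40f4-9ba1-e3356c3f3e89
NEW_SD=7df95b16-1bd3-4c23-bbbe-b21d403bdcd8    # domain for the new (temp) lease
OLD_SD=40825394-bb03-4b66-8a82-e6ddbb789ec3    # domain holding the current (default) lease

# On the SPM host: create the new lease, then clear the SPM task it returns
echo '{"lease": {"sd_id": "'$NEW_SD'", "lease_id": "'$VM_ID'"}}' \
    | vdsm-client -f - Lease create
vdsm-client Task clear taskID=[task_id printed by Lease create]

# On the host running the vm: plug the new lease, then unplug the old one
cat > /root/temp-lease.json <<EOF
{"vmID": "$VM_ID",
 "lease": {"sd_id": "$NEW_SD", "lease_id": "$VM_ID", "type": "lease"}}
EOF
cat > /root/default-lease.json <<EOF
{"vmID": "$VM_ID",
 "lease": {"sd_id": "$OLD_SD", "lease_id": "$VM_ID", "type": "lease"}}
EOF
vdsm-client VM hotplugLease -f /root/temp-lease.json
vdsm-client VM hotunplugLease -f /root/default-lease.json

# Verify: only the new domain's xleases resource should remain for the vm
sanlock client status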

Comment 4 Nir Soffer 2017-01-01 21:13:46 UTC
You must use libvirt >= 2.0.0-10.el7_3.3 to use leases, see bug 1403691.
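
As a quick check (sketch only; compare the output against 2.0.0-10.el7_3.3 by hand), the installed libvirt build on the host can be queried with:

# query the installed libvirt package version on the hypervisor host
rpm -q libvirt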

Comment 5 Sandro Bonazzola 2017-02-01 16:01:30 UTC
oVirt 4.1.0 GA has been released, re-targeting to 4.1.1.
Please check if this issue is correctly targeted or already included in 4.1.0.

Comment 6 Nir Soffer 2017-02-16 14:37:28 UTC
Only the vdsm side is merged, we are waiting for the engine side.

We can verify this bug if we create another bug for the engine side.

Comment 7 Red Hat Bugzilla Rules Engine 2017-02-16 14:37:33 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.

Comment 8 Yaniv Lavi 2017-03-16 11:37:29 UTC
What is the status of the engine side patches?

Comment 9 Yaniv Kaul 2017-03-19 08:50:28 UTC
Missed 4.1.1, moving to 4.1.2 - perhaps it can make it to 4.1.1-1.

Comment 10 Tal Nisan 2017-03-20 10:36:46 UTC
We haven't started working on them yet. Nir made an effort to push the VDSM work into 4.1.1, but we've agreed that the full feature will be supported only in a later 4.1.z release, as the Engine work is more complicated due to the many corner cases that need to be handled.

Comment 11 Yaniv Lavi 2017-04-05 08:34:17 UTC
We will provide a simple implementation for 4.1.z that allows adding or removing a lease, but not the more complex orchestrated flow of switching from one lease to another.

Comment 16 Lilach Zitnitski 2017-05-29 07:38:37 UTC
tested with:
rhevm-4.1.3-0.1.el7.noarch
vdsm-4.19.16-1.el7ev.x86_64
libvirt-2.0.0-10.el7_3.5.x86_64

Steps to Reproduce:
1. On the SPM host, create a temporary lease using:
# echo '{"lease": {"sd_id": "[sd_id for new lease]", "lease_id": "[vm_id]"}}' | vdsm-client -f - Lease create
"75cc6395-6fb9-4143-9a95-988238b44130" (task id)
vdsm-client Task clear taskID=[task_id]
2. On the host running the vm create two files:
default-lease.json (with the current lease information):
{
    "vmID": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
    "lease": {
        "lease_id": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
        "sd_id": "40825394-bb03-4b66-8a82-e6ddbb789ec3",
        "type": "lease"
    }
}

temp-lease.json (with the new lease information):
{
    "vmID": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
    "lease": {
        "sd_id": "7df95b16-1bd3-4c23-bbbe-b21d403bdcd8",
        "lease_id": "952f9034-77da-40f4-9ba1-e3356c3f3e89",
        "type": "lease"
    }
}
3. vdsm-client VM hotplugLease -f [path to temp-lease.json]
sanlock client status - should show that the vm now has 2 leases
4. vdsm-client VM hotunplugLease -f [path to default-lease.json]
sanlock client status - should show only the new lease
5. try to move the storage domain with the old lease to maintenance

Actual results:
Both in the UI and in the REST API the vm still has the lease on the first storage domain. Also, moving this storage domain to maintenance fails with the following error:

Error while executing action: Cannot deactivate domain, the domain contains leases for the following running VMs: vm_lease_test_0

On the host running the vm, sanlock client status shows the correct information.

before making any changes:

[root@storage-ge7-vdsm1 ~]# sanlock client status
daemon c517ab4d-4a9b-41d5-8171-d4be4222e934.storage-ge

r 71c429f0-beb4-4aa3-b1ec-bf02b7a630f5:8dffe84f-8080-4479-87a6-8a6cd9422f7c:/rhev/data-center/mnt/glusterSD/gluster-server01.qa.lab.tlv.redhat.com\:_storage__local__ge7__volume__2/71c429f0-beb4-4aa3-b1ec-bf02b7a630f5/dom_md/xleases:3145728:1 p 32737

after temp lease hotplug:

[root@storage-ge7-vdsm1 ~]# sanlock client status
daemon c517ab4d-4a9b-41d5-8171-d4be4222e934.storage-ge

r b77c9b99-15ed-4bec-96ec-7b336f989891:8dffe84f-8080-4479-87a6-8a6cd9422f7c:/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com\:_Storage__NFS_storage__local__ge7__nfs__0/b77c9b99-15ed-4bec-96ec-7b336f989891/dom_md/xleases:3145728:1 p 32737
r 71c429f0-beb4-4aa3-b1ec-bf02b7a630f5:8dffe84f-8080-4479-87a6-8a6cd9422f7c:/rhev/data-center/mnt/glusterSD/gluster-server01.qa.lab.tlv.redhat.com\:_storage__local__ge7__volume__2/71c429f0-beb4-4aa3-b1ec-bf02b7a630f5/dom_md/xleases:3145728:1 p 32737

after removing first lease:

[root@storage-ge7-vdsm1 ~]# sanlock client status

r b77c9b99-15ed-4bec-96ec-7b336f989891:8dffe84f-8080-4479-87a6-8a6cd9422f7c:/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com\:_Storage__NFS_storage__local__ge7__nfs__0/b77c9b99-15ed-4bec-96ec-7b336f989891/dom_md/xleases:3145728:1 p 32737

Moving to ASSIGNED since the storage domain still cannot be removed or moved to maintenance.

Comment 17 Tal Nisan 2017-05-29 12:48:51 UTC
Lilach, you were testing moving the domain to maintenance on the Engine side, which is expected to fail since the Engine is not aware of the lease hot plug you ran directly in VDSM. You should follow Nir's steps in comment #3 in order to test this.

Comment 18 Lilach Zitnitski 2017-06-01 06:38:46 UTC
(In reply to Tal Nisan from comment #17)
> Lilach, you were testing moving the domain to maintenance on the Engine side,
> which is expected to fail since the Engine is not aware of the lease hot plug
> you ran directly in VDSM. You should follow Nir's steps in comment #3 in
> order to test this.

That's exactly what I did. 
So is there any other way to make sure this operation succeeded, or only by running
"sanlock client status"? Because moving the storage domain to maintenance failed.

Comment 19 Tal Nisan 2017-06-01 08:55:28 UTC
Moving the domain to maintenance is expected to fail; the only way to check is through VDSM. Nir, are there any other steps to verify from the VDSM side, or are we good?

Comment 20 Nir Soffer 2017-06-01 17:30:36 UTC
(In reply to Tal Nisan from comment #19)
> Moving the domain to maintenance is expected to fail; the only way to check
> is through VDSM. Nir, are there any other steps to verify from the VDSM
> side, or are we good?

Testing is explained in comment 3. Using "sanlock client status" and the output
of VM.hotplugLease and VM.hotunplugLease we can see that the old lease was
unplugged and the new lease was plugged.

My note in comment 3 about putting the domain into maintenance was wrong, since
we moved the lease behind engine's back. Once this is integrated in engine, and
engine unplugs the old lease and plugs the new one, it will be possible to put
the domain into maintenance.
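
For example (just a sketch, using the lease_id from comment 3; only the sanlock resource lines, those starting with "r ", matter here), a quick way to see which leases the vm currently holds:

# list only the lease resources held for this vm; after the unplug, only the
# new storage domain's xleases path should appear
sanlock client status | grep '^r ' | grep 952f9034-77da-40f4-9ba1-e3356c3f3e89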

Comment 21 Lilach Zitnitski 2017-06-04 08:16:40 UTC
(In reply to Nir Soffer from comment #20)
> (In reply to Tal Nisan from comment #19)
> > Moving the domain to maintenance is expected to fail; the only way to check
> > is through VDSM. Nir, are there any other steps to verify from the VDSM
> > side, or are we good?
> 
> Testing is explained in comment 3. Using "sanlock client status" and the
> output of VM.hotplugLease and VM.hotunplugLease we can see that the old
> lease was unplugged and the new lease was plugged.
> 
> My note in comment 3 about putting the domain into maintenance was wrong,
> since we moved the lease behind engine's back. Once this is integrated in
> engine, and engine unplugs the old lease and plugs the new one, it will be
> possible to put the domain into maintenance.

So based on comment #19, sanlock client status shows that hotplug and hotunplug of the lease were completed successfully. 
Moving to VERIFIED.

