Bug 1410036 - Xlease volume does not exist in all storage domains in the DC
Summary: Xlease volume does not exist in all storage domains in the DC
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.19.1
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Nir Soffer
QA Contact: Raz Tamir
URL:
Whiteboard:
Duplicates: 1415723 (view as bug list)
Depends On:
Blocks:
 
Reported: 2017-01-04 09:59 UTC by Lilach Zitnitski
Modified: 2022-06-30 13:01 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-25 11:39:17 UTC
oVirt Team: Storage
Embargoed:
sbonazzo: ovirt-4.1-
sbonazzo: blocker-


Attachments (Terms of Use)
logs zip (31.30 KB, application/zip)
2017-01-04 10:00 UTC, Lilach Zitnitski


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-46842 0 None None None 2022-06-30 13:01:27 UTC

Description Lilach Zitnitski 2017-01-04 09:59:20 UTC
Description of problem:
Though all of the storage domains in my environment are v4, not all of them have the xleases volume.
Therefore, a VM lease cannot be created on them, even though they are offered when selecting a storage domain to place the lease on.
When selecting a storage domain without the xleases volume, no error is shown, and from the user's point of view the VM was updated successfully with the new lease, but the VM can't start without this lease now.

Version-Release number of selected component (if applicable):
vdsm-4.19.1-17.gitf1272bf.el7.centos.x86_64
ovirt-engine-4.1.0-0.4.master.20170103091953.gitfaae662.el7.centos.noarch

How reproducible:
100%

Steps to Reproduce:
1. Add storage domains; in my environment some were created with the xleases volume and some were not.
2. Under /rhev/data-center/mnt/... check which storage domains have the xleases volume and which do not (see the example after these steps).
3. Create a new VM, configure high availability, and choose a storage domain without xleases from the drop-down menu.
4. Start the VM.
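
A quick way to do the check in step 2 (a sketch, assuming the domains are mounted under /rhev/data-center/mnt as shown in the listings below; mount names and domain UUIDs are environment-specific):

  # list the dom_md contents of every mounted domain
  ls /rhev/data-center/mnt/*/*/dom_md/

  # or show only the domains that do have an xleases volume
  find /rhev/data-center/mnt -maxdepth 4 -name xleases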

Actual results:
No error is shown even though xleases is not found in this storage domain, and the VM fails to start.

Expected results:
The perfect scenario is when all SDs have the xleases volume, but if some don't, at least an error should appear to warn the user that the lease was not created.

Additional info:

ovirt-engine
2017-01-04 11:38:44,911+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-12) [7344f5c7] Correlation ID: 4900870a-0259-46d5-a04b-9cef0e096574, Job ID: 40b0820f-ab2f-40ac-82d9-4aa2149bbbd7, Call Stack: null, Custom Event ID: -1, Message: VM failover-test1 configuration was updated by admin@internal-authz

|-- 6ef7ff1f-eb07-41fe-b843-d179a5b71a37 -> /rhev/data-center/mnt/glusterSD/10.35.65.25:_lilach__data/6ef7ff1f-eb07-41fe-b843-d179a5b71a37
|   |-- dom_md
|   |   |-- ids
|   |   |-- inbox
|   |   |-- leases
|   |   |-- metadata
|   |   `-- outbox

|-- 83dda8e2-f882-4d67-bd4c-77eca35fbb22 -> /rhev/data-center/mnt/10.35.110.11:_Storage__NFS_lilah__export_data__nfs2/83dda8e2-f882-4d67-bd4c-77eca35fbb22
|   |-- dom_md
|   |   |-- ids
|   |   |-- inbox
|   |   |-- leases
|   |   |-- metadata
|   |   `-- outbox

|-- b5a6e503-33f5-470f-9bbe-9121d00ba981 -> /rhev/data-center/mnt/10.35.110.11:_Storage__NFS_lilah__export_data__nfs1/b5a6e503-33f5-470f-9bbe-9121d00ba981
|   |-- dom_md
|   |   |-- ids
|   |   |-- inbox
|   |   |-- leases
|   |   |-- metadata
|   |   |-- outbox
|   |   `-- xleases

|-- e439ba21-a7f9-43f1-839b-9fcd7cc05c2d -> /rhev/data-center/mnt/blockSD/e439ba21-a7f9-43f1-839b-9fcd7cc05c2d
|   |-- dom_md
|   |   |-- ids -> /dev/e439ba21-a7f9-43f1-839b-9fcd7cc05c2d/ids
|   |   |-- inbox -> /dev/e439ba21-a7f9-43f1-839b-9fcd7cc05c2d/inbox
|   |   |-- leases -> /dev/e439ba21-a7f9-43f1-839b-9fcd7cc05c2d/leases
|   |   |-- master -> /dev/e439ba21-a7f9-43f1-839b-9fcd7cc05c2d/master
|   |   |-- metadata -> /dev/e439ba21-a7f9-43f1-839b-9fcd7cc05c2d/metadata
|   |   |-- outbox -> /dev/e439ba21-a7f9-43f1-839b-9fcd7cc05c2d/outbox
|   |   `-- xleases -> /dev/e439ba21-a7f9-43f1-839b-9fcd7cc05c2d/xleases

Comment 1 Lilach Zitnitski 2017-01-04 10:00:12 UTC
Created attachment 1237084 [details]
logs zip

engine log
vdsm log

Comment 2 Nir Soffer 2017-01-09 11:01:45 UTC
Storage domains created with a development version of vdsm that did not support
external leases are not supported.

You can create the missing xleases volumes using vdsm-tool, using this patch:
https://gerrit.ovirt.org/#/c/69069/

I suggest waiting on this bug until the patches land in 4.1. This may be an
interesting use case for support: recovering from a disaster by creating a new
xleases volume.
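
The exact vdsm-tool invocation depends on the patch linked above, so it is not shown here. In the meantime, a quick way to check whether a block domain already has the xleases volume is to list the LVs of the domain VG (a sketch, not an official procedure; the VG name equals the storage domain UUID, written here as a placeholder):

  # run on a host that sees the storage; the VG name is the domain UUID
  lvs -o lv_name,lv_size {storage domain UUID} | grep xleases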

Comment 3 Tal Nisan 2017-01-15 09:33:20 UTC
Targeting 4.1 beta, to be revisited when all the patches are in.

Comment 4 Fred Rolland 2017-01-23 16:51:58 UTC
*** Bug 1415723 has been marked as a duplicate of this bug. ***

Comment 5 Allon Mureinik 2017-01-23 20:25:45 UTC
Lilach - just to make sure:
- are ALL your VDSM hosts the same version? 
- are ALL these domains newly created, or do you have some preexisting domains before upgrading VDSM?

Comment 6 Allon Mureinik 2017-01-23 20:28:10 UTC
(In reply to Nir Soffer from comment #2)
> Storage domains created with a development version of vdsm that did not
> support external leases are not supported.
> 
> You can create the missing xleases volumes using vdsm-tool, using this patch:
> https://gerrit.ovirt.org/#/c/69069/
> 
> I suggest waiting on this bug until the patches land in 4.1. This may be an
> interesting use case for support: recovering from a disaster by creating a
> new xleases volume.

Having a way to use vdsm-tool to recover is great, but we need to address the core issue.
If we're creating domains without the xleases volume, that's a bug, and we need to fix it.
If this is indeed an upgrade-between-builds issue, its priority can probably be reduced, or the bug even closed; let's wait for Lilach's feedback on comment 5 before we decide how to proceed.

Comment 7 Lilach Zitnitski 2017-01-24 08:02:07 UTC
(In reply to Allon Mureinik from comment #5)
> Lilach - just to make sure:
> - are ALL your VDSM hosts the same version? 
> - are ALL these domains newly created, or do you have some preexisting
> domains before upgrading VDSM?

Yes, all of the hosts in the env had the same vdsm version.
About the storage domains, some were imported and some were newly created. For example, export_data__nfs2 was a new domain and it was created without the xleases volume.
I have to add that since then my hosts and engine have been upgraded a few times and the bug didn't reproduce.

Comment 8 guy chen 2017-01-24 09:35:39 UTC
Please be advised that this was also reproduced on Nisim's system and mine when upgrading between 4.1 downstream builds (build 4 to build 7); the vdsm log is attached to duplicate bug 1415723 if needed.

Comment 9 Yaniv Kaul 2017-01-25 11:39:17 UTC
Closing for the time being, as this was only found in intermediate development versions and will not happen in a real environment (QE: please verify it doesn't happen from 4.0 to 4.1 beta, for example!).

Comment 10 guy chen 2017-01-25 13:58:45 UTC
After fixing my system with Nir, here is another procedure that can fix the issue (see the consolidated sketch after these steps):

1. Put the SPM host into maintenance.
2. Stop vdsmd.
3. Identify the VG id of your current storage domain that is missing the xleases LV, using lvs.
4. Add a new storage domain.
5. Create the xleases LV on the original storage domain:
   lvcreate -n xleases -L 1g {original VG id}
6. Copy the new VG's xleases to the original VG's xleases:
   time dd if=/dev/{new VG id}/xleases of=/dev/{original VG id}/xleases bs=8M oflag=direct
7. Start vdsmd.
8. Activate the host.
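
A consolidated sketch of the above as shell commands (the {VG id} values are placeholders for the ids found in steps 3 and 4; run on the SPM host while it is in maintenance; the lvchange step is an added assumption, since on block storage the LVs may not be active before the copy):

  systemctl stop vdsmd
  lvs -o vg_name,lv_name | grep -w xleases      # see which domain VGs already have the LV
  lvcreate -n xleases -L 1g {original VG id}
  lvchange -ay {new VG id}/xleases {original VG id}/xleases
  dd if=/dev/{new VG id}/xleases of=/dev/{original VG id}/xleases bs=8M oflag=direct
  systemctl start vdsmd
  # then activate the host from the engine UI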

Comment 11 Raz Tamir 2017-01-29 14:22:00 UTC
Yaniv,
It is important to distinguish between the issue that caused Lilach to open the bug and the issue caused by the upgrade process.

The former is not related to upgrading between in-between versions.

For now we will keep this bug closed until we face this issue again.

