Bug 1417458 - Cold Merge: Use volume generation
Summary: Cold Merge: Use volume generation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.1
Target Release: 4.1.1
Assignee: Ala Hino
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-01-29 15:10 UTC by Ala Hino
Modified: 2017-04-21 09:48 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A host becomes non-responsive.
Consequence: The engine is unable to determine the status of the merge job.
Fix: Use the volume generation.
Result: When a host becomes non-responsive, the engine attempts to fence the job, i.e. the host will fail to execute the job if it attempts to. When the engine then tries to execute the job on a different host, there are two possible outcomes:
- The job completed on the previous host. This attempt fails (as expected) because the generation differs between the engine and the host (the generation on the host was incremented because the job completed successfully).
- The job failed on the previous host. This attempt succeeds because the generation on the engine equals the generation on the host.
Clone Of:
Environment:
Last Closed: 2017-04-21 09:48:38 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 71347 0 ovirt-engine-4.1 MERGED Cold Merge: Use volume generation 2017-01-30 11:15:02 UTC

Description Ala Hino 2017-01-29 15:10:26 UTC
Description of problem:
Generation support is used to improve error handling for jobs on
non-responsive hosts and to determine the job status: started, did not
start, failed, or completed. Based on the generation and the volume lease,
we can decide whether to fence the job.
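
The generation check described above can be illustrated with a minimal sketch. This is not the Vdsm or ovirt-engine code: `Volume`, `run_merge`, `retry_on_other_host`, and `GenerationMismatch` are hypothetical names, the volume lease is left out, and the only assumption taken from this bug is that a successful job increments the generation while a failed one leaves it unchanged.

```python
from dataclasses import dataclass


class GenerationMismatch(Exception):
    """Raised when the caller's expected generation is stale (hypothetical)."""


@dataclass
class Volume:
    # Hypothetical stand-in for the volume metadata kept on shared storage.
    generation: int = 0


def run_merge(volume, expected_generation, fail=False):
    """Run a simulated merge job if the expected generation still matches.

    The generation is bumped only when the job completes, so a later
    caller holding the old value can tell that the job already ran.
    """
    if volume.generation != expected_generation:
        raise GenerationMismatch(
            "expected %d, found %d" % (expected_generation, volume.generation)
        )
    if fail:
        # Simulates a job interrupted before completion (e.g. Vdsm stopped).
        raise RuntimeError("merge interrupted")
    volume.generation += 1


def retry_on_other_host(volume, engine_generation):
    """Retry the job with the generation the engine recorded before the first attempt."""
    try:
        run_merge(volume, engine_generation)
    except GenerationMismatch:
        # The previous host finished the job; nothing to redo.
        return "already completed on previous host"
    return "completed on retry"
```

If the first attempt completed, a retry with the engine's recorded generation is rejected; if the first attempt failed, the generation is unchanged and the retry runs normally.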

Steps to Reproduce:
1. Start cold merge
2. Stop Vdsm during merge step (watch the log to see when merge starts)
3. Try again

Comment 1 Kevin Alon Goldblatt 2017-02-16 13:41:45 UTC
Tested with the following code:
-----------------------------------------------
ovirt-engine-4.1.1-0.1.el7.noarch
rhevm-4.1.1-0.1.el7.noarch
vdsm-4.19.5-1.el7ev.x86_64

Verified with the following scenario:
----------------------------------------------
Create VM with disks on system with 2 hosts
Stop the VM
Start a cold merge and stop the vdsm on the Performing HSM during the cold merge
The 2nd HSM continues the job successfully
Start the previously stopped vdsm on second host >>>>> no attempt is made to continue the previous job

Moving to VERIFIED!

