Bug 1470055 - After deleting a snapshot, the disk status was illegal
Summary: After deleting a snapshot, the disk status was illegal
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: ---
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent (1 vote)
Target Milestone: ovirt-4.2.0
Target Release: ---
Assignee: Ala Hino
QA Contact: Natalie Gavrielov
URL:
Whiteboard:
Depends On: 1508560
Blocks:
 
Reported: 2017-07-12 10:56 UTC by Massimo
Modified: 2018-01-12 12:57 UTC
CC: 8 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-01-12 12:57:19 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.2+


Attachments
Log Manager (84 bytes, text/plain)
2017-07-12 10:56 UTC, Massimo
no flags Details
Manager log (4.01 MB, text/plain)
2017-07-13 10:33 UTC, Massimo
no flags Details
SPM Owner (11.90 MB, text/plain)
2017-07-13 10:34 UTC, Massimo
no flags Details
Host where there is the VM (8.64 MB, text/plain)
2017-07-13 10:35 UTC, Massimo
no flags Details
Manager LOG (6.62 MB, text/plain)
2017-09-14 12:03 UTC, Massimo
no flags Details
VM Owner (2.33 MB, application/zip)
2017-09-14 12:08 UTC, Massimo
no flags Details
SPM Owner (2.58 MB, application/zip)
2017-09-14 12:09 UTC, Massimo
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 82528 0 None MERGED core: Retry failed live merge commands 2021-02-08 09:22:19 UTC

Description Massimo 2017-07-12 10:56:48 UTC
Created attachment 1296863 [details]
Log Manager

Hi,
I moved a disk from one storage domain to another; after that I tried to remove the snapshot, and now the status of the disk snapshot is illegal.
oVirt manager 3.6.7
oVirt Node - 3.6 - 0.999.201608241021.el7.centos
I am sending the log of the manager and of the SPM host.

Comment 1 Massimo 2017-07-12 11:01:26 UTC
How can I send the logs?

Comment 2 Tal Nisan 2017-07-12 12:07:20 UTC
You can attach the logs in Bugzilla, please include the Engine logs and VDSM log of the SPM host for starters.
Given that you are referring to a snapshot after a move, I assume you are moving the disk while the VM is up, right?

Comment 3 Massimo 2017-07-12 14:24:08 UTC
Hi, the logs are too big; how can I send them?
The problem is that I moved the disk a few days ago, and I don't know whether you will be able to see the problem if I upload only the latest VDSM and engine logs.
I moved the disk while the VM was up.
Regards,
Massimo

Comment 4 Tal Nisan 2017-07-12 15:19:11 UTC
Ala, this reminds me of a bug we've solved in 3.6.z, can you please confirm?

Comment 5 Massimo 2017-07-12 15:39:24 UTC
I promise that I will upgrade the infrastructure, but first I have to solve this problem; then I can upgrade.
I have the sosreport from the SPM host and the output of engine-log-collector --no-hypervisors from the engine server, but they are too big; how can I upload them?
Regards,
Massimo

Comment 6 Massimo 2017-07-13 10:33:21 UTC
Created attachment 1297500 [details]
Manager log

Comment 7 Massimo 2017-07-13 10:34:07 UTC
Created attachment 1297501 [details]
SPM Owner

Comment 8 Massimo 2017-07-13 10:35:04 UTC
Created attachment 1297503 [details]
Host where there is the VM

Comment 9 Vasily Topolsky 2017-07-26 10:23:13 UTC
Any news here? I have the same issue after live storage migration with oVirt 4.0.6.3. Should I report a new bug?

Comment 10 Ala Hino 2017-07-26 11:59:58 UTC
We have a couple of bugs in this area, BZ 1383301 and BZ 1467928, that I am looking at, and I will also analyze the issue reported here.

Comment 11 Massimo 2017-08-31 08:27:10 UTC
Hi,
Can someone help me? I have to fix this problem; how can I do that?
It's important because I have to decommission the old storage.
Regards,
Massimo

Comment 12 Sandro Bonazzola 2017-08-31 08:42:21 UTC
Ala, any progress on this? Last update is one month old.

Comment 13 Ala Hino 2017-09-12 09:05:10 UTC
Hi Massimo,

Apologies for the delay in my reply. I will work with you to fix the illegal disk snapshot issue.

Looking at the logs, I see that the snapshot was merged on the node (host/Vdsm) side, but during one of the merge steps we still see the volume; hence we fail the merge and mark the disk snapshot as illegal.

I'd like to ask you to send me the volume info of all the volumes in the chain:
e8c5d918-a0a0-4285-af21-1f821de84cf3 and 543d39c7-d75b-415b-baee-fa8d76b00724. This can be done by running (on the node):

$ vdsClient -s 0 getVolumeInfo 8d9b7a04-0dce-465e-b778-d1bc60c90afb c6f4af45-0b09-4dfb-b391-338b82e580a9 c894df50-ff15-4d3e-9bc2-f74e087d7315 543d39c7-d75b-415b-baee-fa8d76b00724

$ vdsClient -s 0 getVolumeInfo 8d9b7a04-0dce-465e-b778-d1bc60c90afb c6f4af45-0b09-4dfb-b391-338b82e580a9 c894df50-ff15-4d3e-9bc2-f74e087d7315 e8c5d918-a0a0-4285-af21-1f821de84cf3

In addition, I'd like to ask you to try live merge again and send me the logs of the engine and the hosts (HSM and SPM).
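
For reference, getVolumeInfo takes its arguments in the order sdUUID spUUID imgUUID volUUID (storage domain, storage pool, image and volume UUIDs). A minimal sketch with placeholder values; the qemu-img cross-check and the volume path are an assumption and depend on whether the domain is file- or block-based:

# Argument order: sdUUID spUUID imgUUID volUUID
$ vdsClient -s 0 getVolumeInfo <sdUUID> <spUUID> <imgUUID> <volUUID>

# Optional cross-check of the qcow2 backing chain on the host
# (path is a placeholder; on file domains the volume lives under
#  .../<sdUUID>/images/<imgUUID>/<volUUID>)
$ qemu-img info --backing-chain /path/to/<imgUUID>/<volUUID>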

Comment 14 Ala Hino 2017-09-12 09:47:31 UTC
Can you also run the following command on the node:

vdsClient -s 0 getVmsList c6f4af45-0b09-4dfb-b391-338b82e580a9 8d9b7a04-0dce-465e-b778-d1bc60c90afb
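
Note that getVmsList is expected to be run on the SPM host; its argument order is spUUID followed by sdUUID, and it lists the VM UUIDs recorded on that storage domain. A minimal sketch with placeholder values:

# Run on the SPM host; argument order: spUUID sdUUID
$ vdsClient -s 0 getVmsList <spUUID> <sdUUID>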

Comment 15 Massimo 2017-09-12 11:43:46 UTC
[root@ovirthpr02 ~]# vdsClient -s 0 getVolumeInfo 8d9b7a04-0dce-465e-b778-d1bc60c90afb c6f4af45-0b09-4dfb-b391-338b82e580a9 c894df50-ff15-4d3e-9bc2-f74e087d7315 543d39c7-d75b-415b-baee-fa8d76b00724
        status = OK
        domain = 8d9b7a04-0dce-465e-b778-d1bc60c90afb
        capacity = 322122547200
        voltype = INTERNAL
        description = {"DiskAlias":"Graphitep01_Disk2","DiskDescription":"vgdata"}
        parent = 00000000-0000-0000-0000-000000000000
        format = RAW
        image = c894df50-ff15-4d3e-9bc2-f74e087d7315
        uuid = 543d39c7-d75b-415b-baee-fa8d76b00724
        disktype = 2
        legality = LEGAL
        mtime = 0
        apparentsize = 322122547200
        truesize = 322122547200
        type = PREALLOCATED
        children = []
        pool =
        ctime = 1448627779

Comment 16 Massimo 2017-09-12 11:44:06 UTC
[root@ovirthpr02 ~]# vdsClient -s 0 getVolumeInfo 8d9b7a04-0dce-465e-b778-d1bc60c90afb c6f4af45-0b09-4dfb-b391-338b82e580a9 c894df50-ff15-4d3e-9bc2-f74e087d7315 e8c5d918-a0a0-4285-af21-1f821de84cf3
        status = OK
        domain = 8d9b7a04-0dce-465e-b778-d1bc60c90afb
        capacity = 322122547200
        voltype = LEAF
        description =
        parent = 543d39c7-d75b-415b-baee-fa8d76b00724
        format = COW
        image = c894df50-ff15-4d3e-9bc2-f74e087d7315
        uuid = e8c5d918-a0a0-4285-af21-1f821de84cf3
        disktype = 2
        legality = LEGAL
        mtime = 0
        apparentsize = 216895848448
        truesize = 216895848448
        type = SPARSE
        children = []
        pool =
        ctime = 1495460387

Comment 17 Massimo 2017-09-12 11:46:10 UTC
[root@ovirthpr02 ~]# vdsClient -s 0 getVolumeInfo 8d9b7a04-0dce-465e-b778-d1bc60c90afb c6f4af45-0b09-4dfb-b391-338b82e580a9 c894df50-ff15-4d3e-9bc2-f74e087d7315 e8c5d918-a0a0-4285-af21-1f821de84cf3
        status = OK
        domain = 8d9b7a04-0dce-465e-b778-d1bc60c90afb
        capacity = 322122547200
        voltype = LEAF
        description =
        parent = 543d39c7-d75b-415b-baee-fa8d76b00724
        format = COW
        image = c894df50-ff15-4d3e-9bc2-f74e087d7315
        uuid = e8c5d918-a0a0-4285-af21-1f821de84cf3
        disktype = 2
        legality = LEGAL
        mtime = 0
        apparentsize = 216895848448
        truesize = 216895848448
        type = SPARSE
        children = []
        pool =
        ctime = 1495460387

[root@ovirthpr02 ~]# vdsClient -s 0 getVolumeInfo 8d9b7a04-0dce-465e-b778-d1bc60c90afb c6f4af45-0b09-4dfb-b391-338b82e580a9 c894df50-ff15-4d3e-9bc2-f74e087d7315 543d39c7-d75b-415b-baee-fa8d76b00724
        status = OK
        domain = 8d9b7a04-0dce-465e-b778-d1bc60c90afb
        capacity = 322122547200
        voltype = INTERNAL
        description = {"DiskAlias":"Graphitep01_Disk2","DiskDescription":"vgdata"}
        parent = 00000000-0000-0000-0000-000000000000
        format = RAW
        image = c894df50-ff15-4d3e-9bc2-f74e087d7315
        uuid = 543d39c7-d75b-415b-baee-fa8d76b00724
        disktype = 2
        legality = LEGAL
        mtime = 0
        apparentsize = 322122547200
        truesize = 322122547200
        type = PREALLOCATED
        children = []
        pool =
        ctime = 1448627779
		
[root@ovirthpr02 ~]# vdsClient -s 0 getVmsList c6f4af45-0b09-4dfb-b391-338b82e580a9 8d9b7a04-0dce-465e-b778-d1bc60c90afb
Not SPM

Comment 18 Ala Hino 2017-09-12 12:46:26 UTC
Thanks, Massimo.

Can you please run the last command on the SPM?

In addition, will you be able to retry the live merge of the snapshot with the illegal disk and upload the logs?

Comment 19 Massimo 2017-09-14 10:17:12 UTC
Hi, this is the output of the command on the SPM host:

[root@ovirtpr03 ~]# vdsClient -s 0 getVmsList c6f4af45-0b09-4dfb-b391-338b82e580a9 8d9b7a04-0dce-465e-b778-d1bc60c90afb

================================
00206e0c-1bb6-4bb1-b0dd-95d73ac90e49
================================
008ad015-da85-4ecd-86d4-65ed5c0cb39d
================================
0400940b-1d6f-429e-9fa8-29e150060901
================================
05d5bb32-1b81-43d9-be23-c11407444d63
================================
06a6052d-ed9c-4819-8452-51981eb60c30
================================
0c446581-5f33-4954-9c6a-fa05d5545b6f
================================
1067000f-6e86-49b6-b0b1-a62de7427f6a
================================
142076be-213e-422e-ae33-6911d911c936
================================
1613ffb7-3ac5-4ca7-9b5d-a71db43472d6
================================
17a8a745-8077-4153-b831-b854282072f7
================================
1f19a785-70b4-455d-beaf-30d495201741
================================
25732228-446c-4780-ab0f-e87b6d50c753
================================
2cbe9c2b-0196-4a39-8b91-504220350620
================================
2eb75698-5ebe-40fd-9d1f-61171220061d
================================
36364b61-ca2e-44f8-9f38-001967a7decc
================================
3ec5ae3f-2383-4d47-967e-125c1f5df1ae
================================
41c99e43-b760-455d-82c7-fc9be2c2977e
================================
451ce920-11d4-417a-b405-e78959a92dc9
================================
46fbdc42-a81d-433a-a05b-0d8c664b6f43
================================
472a2c4f-b63b-4d4a-a4d9-e727ecf8d0e4
================================
4aea7a47-79f4-47ab-bafa-b2d541857902
================================
4cd5c320-7ecd-4e7d-be49-f847bbb84486
================================
4de51e2f-be6f-4bb6-a20d-c681333d54b0
================================
4e85de83-0667-43cc-904b-695b3fd26e0d
================================
526c23ff-8f91-4ee9-9114-1de587d904c8
================================
5533189f-ffaa-4892-9d0a-5ff9748447cb
================================
5cdd937f-e153-4693-a110-4024db67b6ce
================================
5ddf2742-93c8-4c09-814a-1cf24fcc5e0c
================================
624028ab-4f5c-4640-9004-51b0f094d6b5
================================
64fd4b55-f851-4e00-9f1d-b59c5c0e440c
================================
65f4e9f1-29ec-4d34-b448-3c204f3879a2
================================
65f96876-648c-4bb3-88f6-2ae8c98c2abf
================================
6ff3a156-4525-4fe0-b600-28caeccb073d
================================
71d203a9-1f34-4571-9083-fd4b27c55d21
================================
785fbe62-73cf-4652-b602-2ce66441edfc
================================
7a8da09b-2d03-423d-bf81-77f079e48482
================================
7c875096-b9f2-4acd-8b3c-c0bc50940074
================================
7d2b92c1-92eb-4ca1-92b8-7d1293ed61ed
================================
7de40820-9302-40dd-8821-716602147cdf
================================
7f85f231-dd20-4ef7-93ea-644abe859604
================================
7fa03d7b-2879-4175-b2ac-62c9b538938b
================================
804996c8-f3c9-4288-9f8a-1ba1592702f3
================================
856c07d6-87ff-487b-a114-2ea4596d5cce
================================
8877c861-f123-4d01-bed6-d9b5a339886b
================================
8ce96384-2815-4717-b8a1-66950f0617bd
================================
8f09e4d4-d49d-4409-858b-f122e2005881
================================
90318899-2507-43b9-815e-d67c2eb5b7cb
================================
910cddde-44c8-4a57-bcb5-b8f02afbd0ce
================================
91fff261-359f-4463-93f2-917f1086626b
================================
92078524-6102-4c03-96fe-a82fc0399375
================================
920ba79e-67e2-4e27-a15a-c2bc9e7c4920
================================
94ba78e4-9e7b-4a24-8897-cd726468a99e
================================
96db407e-109b-4084-a1f9-a469bd99fafd
================================
98b0345c-624d-404c-9eac-1ca9f1ddc204
================================
9b97bc23-aeb8-4a04-808a-55b654c55c75
================================
9d02fd0d-2cd8-495c-a405-44bb0a0aab7e
================================
9f2564cf-479a-4fd9-a660-0df14fb49c47
================================
a105d795-fad2-477e-aef0-13b32f37eabe
================================
a43d4a82-10f2-4d1f-80d7-4c4bd371b609
================================
a81b892b-0ca6-4998-a0f9-89569ebcaa6a
================================
ab9955db-d7d5-4b5b-bb06-79c89054cdbd
================================
ae251dab-da5f-4c19-adcd-16d9308df952
================================
afeb9c36-9eb3-4aa3-be2d-34077f1974ab
================================
b0e37d17-3b2e-476e-97a9-8c8a2a934ef9
================================
b1b06794-bfff-4bec-a2c5-c3ea9047b2b6
================================
b328121a-edac-4d02-933f-2083155a8792
================================
b48d09ab-c932-49f7-8ad0-48e2fcccee26
================================
b992d24a-d698-4c58-bbd1-d27060629211
================================
b9b621c8-12a2-4397-acf2-a37655c992d7
================================
c0510dc4-4767-4660-a652-6b0467131a57
================================
c0b1847f-4701-4ee9-b83b-ff94f5ef0f0a
================================
c295036e-fa0e-4145-8e2a-ca3d343d08db
================================
c5553c60-4d59-418e-9d16-923f5b52009a
================================
c83ce5bd-2f85-44b1-9991-7c971a5c967c
================================
caedcc32-7a65-4b89-9dca-028ded094445
================================
cb9ee4f2-a90d-4392-ab47-bfcfbd986312
================================
d510e736-dfea-4ac0-b4fb-35046d8bd05b
================================
d805ca72-33d5-4f07-8bf2-8a565f51c626
================================
da67aefc-d7d1-4968-87ab-6d310ecbe376
================================
dc2c8bb3-59a2-498a-8cec-840c1a9d2e00
================================
e00c470d-6779-4a0e-93f4-9906b9bcf3e4
================================
e5885956-d0ee-48f2-9f1f-324572cdb583
================================
e620d325-0746-4261-80ee-23a0c6b3bae0
================================
e83b9b41-b75b-4c65-a61c-191d51a128dd
================================
e8b15706-cd56-4180-886e-e3b03a93115f
================================
ebf7a661-9ccc-4983-8996-a814f4edd156
================================
ec50b2a6-5aef-41f7-bdae-0d24d2c2e7bb
================================
eff2a943-f911-4794-9b94-f5f8aaa5ec3d
================================
f13113f1-dbde-4ec2-9ce4-f0f493d49bde
================================
f52e9afa-07f5-4092-ad3f-c8bc2c945fb7
================================
f64afe61-7a9c-4e45-84bc-1e952b707cd2
================================
fecf84e4-6b97-4d4a-9c1c-a94cd0242a6e

Comment 20 Massimo 2017-09-14 10:23:39 UTC
Hi Ala,
I retried the live merge of the snapshot with the illegal disk two days ago, and on the manager I have the following message:
	
Sep 12, 2017 1:47:33 PM
	
Removing Snapshot Auto-generated for Live Storage Migration of VM Graphitep01

I can't believe it takes two days to remove the snapshot, and the task is still running.

Comment 21 Massimo 2017-09-14 12:03:48 UTC
Created attachment 1325946 [details]
Manager LOG

Comment 22 Massimo 2017-09-14 12:08:13 UTC
Created attachment 1325949 [details]
VM Owner

Comment 23 Massimo 2017-09-14 12:09:22 UTC
Created attachment 1325950 [details]
SPM Owner

Comment 24 Yaniv Kaul 2017-10-25 11:56:10 UTC
Ala - any updates?

Comment 25 Ala Hino 2017-10-25 15:08:45 UTC
With the latest changes in this area, this bug should be fixed as well. However, I am not moving it to MODIFIED because I want to provide the reporter with a way to recover from this situation.

Comment 26 Ala Hino 2017-10-25 15:12:26 UTC
Massimo,

Can you please provide the following info?

1. On the host, run "virsh -r list" to find out the ID of the VM
2. On the host, run "virsh -r dumpxml <Vm_id>"

Please upload the output of the second command.

3. On the engine, can you please run "ovirt-log-collector --no-hypervisors"?
Please upload the generated sosreport file.
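
Put together, the commands look roughly like this (the output file name is just a placeholder):

# On the host running the VM: find the libvirt ID/name of the VM, then dump its XML
$ virsh -r list
$ virsh -r dumpxml <Vm_id> > vm-dump.xml

# On the engine machine: collect engine-side logs only
$ ovirt-log-collector --no-hypervisors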

Comment 27 Ala Hino 2017-10-29 09:17:08 UTC
This bug is supposed to be fixed by merging https://gerrit.ovirt.org/82528.
Moving to MODIFIED.

Once I get a reply to comment #26, I will provide the reporter with instructions to recover from the illegal disk status.

Comment 30 Avihai 2017-11-01 16:42:36 UTC
I ran a similar scenario - delete snapshot after live merge - and encountered what looks like a different issue that blocks me from verifying this bug.

New issue details at https://bugzilla.redhat.com/show_bug.cgi?id=1508560

Comment 31 Massimo 2017-11-02 17:59:36 UTC
Hi Ala,
I solved my problem in a different way: I created a new disk, moved the data, and after that I shut down the VM and deleted the disk with the snapshot.
I hope I never have that problem again.
Regards
Massimo

Comment 32 Ala Hino 2017-11-02 18:01:54 UTC
Hi Massimo,

I am glad you were able to resolve the problem.

We made quite major improvements in the live merge area in 4.1.7 (and 4.1.8) that should significantly stabilize live merge.

Comment 33 Natalie Gavrielov 2017-12-27 12:54:30 UTC
Ala,

In order to verify this issue, is performing Avihai's scenario described in comment 30 good enough?

Comment 34 Ala Hino 2017-12-27 15:30:16 UTC
Basically, this issue is very hard to reproduce. In order to reproduce it, I had to hard-code an extra sleep into the code.

There are quite a few bugs in this area that have already been verified. Still, I'd recommend trying a live merge of the active layer, i.e. creating a single snapshot of a VM with multiple disks and then removing it.
You can of course try Avihai's steps as well.
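
A rough sketch of that scenario using the REST API, in case it helps; the engine URL, credentials and IDs below are placeholders, and --insecure is only meant for a test setup with a self-signed certificate:

# Create a snapshot of a running VM with multiple disks
$ curl --insecure -u admin@internal:PASSWORD -H "Content-Type: application/xml" \
    -d "<snapshot><description>live-merge test</description></snapshot>" \
    https://ENGINE_FQDN/ovirt-engine/api/vms/VM_ID/snapshots

# With the VM still up, delete the snapshot; this triggers the live merge
$ curl --insecure -u admin@internal:PASSWORD -X DELETE \
    https://ENGINE_FQDN/ovirt-engine/api/vms/VM_ID/snapshots/SNAPSHOT_ID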

Comment 35 Natalie Gavrielov 2018-01-01 13:22:18 UTC
Verified using builds:
rhvm-4.2.0.2-0.1.el7.noarch
vdsm-4.20.9.3-1.el7ev.x86_64

Used scenarios described in comment 34 and comment 30.

