Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1318702

Summary: dd command between 2 vm's disks on FC domain, cause the destination device to grow more than it should
Product: [oVirt] ovirt-engine
Reporter: Raz Tamir <ratamir>
Component: BLL.Storage
Assignee: Nir Soffer <nsoffer>
Status: CLOSED NOTABUG
QA Contact: Aharon Canan <acanan>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.6.4
CC: acanan, bugs, nsoffer, ratamir, tnisan
Target Milestone: ovirt-3.6.5
Keywords: Automation
Target Release: ---
Flags: amureini: ovirt-3.6.z?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-21 12:06:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
- engine log (flags: none)
- engine and vdsm logs (flags: none)

Description Raz Tamir 2016-03-17 14:27:20 UTC
Created attachment 1137405 [details]
engine log

Description of problem:
I have a VM with 2 disks on an FC domain, both thin-provisioned 10GB. The first disk is bootable, with RHEL 6.7 installed. The second disk is empty (no filesystem).

From inside the guest I run a dd command from the first disk (with the OS) to the empty disk:
dd if=/dev/vdb of=/dev/vdc bs=1M oflag=direct

The second disk's actual size before the dd is 1GB:
<actual_size>1073741824</actual_size>

The first disk's actual size is 4.7GB:
<actual_size>4697620480</actual_size>

After the dd, the second disk's actual size grows more than expected, to 11GB:
<actual_size>11811160064</actual_size>
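The byte counts above mix decimal GB and binary GiB, which makes the growth harder to read at a glance. A quick sanity check of the reported actual_size values (in bytes; the variable names here are illustrative only):

```python
# Sanity-check the actual_size values reported above (all in bytes).
GiB = 2**30  # binary gigabyte

dest_before = 1_073_741_824    # destination before dd: exactly 1 GiB
source      = 4_697_620_480    # source: ~4.7 decimal GB (~4.375 GiB)
dest_after  = 11_811_160_064   # destination after dd: exactly 11 GiB

assert dest_before == 1 * GiB
assert dest_after == 11 * GiB
print(round(source / 1e9, 1))  # 4.7 (decimal GB, as reported)
```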



Version-Release number of selected component (if applicable):
vdsm-4.17.23.1-0.el7ev.noarch
rhevm-3.6.4-0.1.el6.noarch

How reproducible:
90%

Steps to Reproduce:
1. Create a VM with one 10GB thin-provisioned disk (with an OS) on an FC domain
2. Create a second 10GB thin-provisioned disk on the FC domain and attach it to the VM
3. From inside the guest, run:
   # dd if=/dev/vdb of=/dev/vdc bs=1M oflag=direct

Actual results:
The destination disk's actual size grows from 1GB to 11GB, as described above.


Expected results:


Additional info:

Comment 1 Yaniv Kaul 2016-03-20 10:58:47 UTC
- vdsm.log is missing.
- Is that a regression?
- Are you sure that it's not just a reporting issue - has the size actually grown? What do you see with 'qemu-img info' ?
- How much did you expect it to grow? (I'm missing a 'count=X' parameter to the 'dd' command)
- Did you take into account qcow2 metadata size?

Comment 2 Raz Tamir 2016-03-20 12:00:41 UTC
- Attaching new vdsm and engine logs
- I'm not sure if it's a regression or not
- Yes, the size is growing:
vdsClient reports that the size of the destination disk before the dd command is: truesize = 1073741824 (1GB) and capacity = 10737418240 (10GB).

The size of the source is (according to vdsClient):
truesize = 4160749568 (4GB)
capacity = 10737418240 (10GB)

After the dd command the destination disk size is 11GB:
truesize = 11811160064
capacity = 10737418240

- I'm expecting it to be 4GB like the source disk - this works fine using NFS, iSCSI and Gluster.

- The metadata should be 128MB (extent size).

Important info I see in the engine logs: the VM enters a paused state during the operation:

2016-03-20 13:48:50,768 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-88) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM test has been paused due to no Storage space error.

Comment 3 Raz Tamir 2016-03-20 12:01:57 UTC
Created attachment 1138253 [details]
engine and vdsm logs

Comment 4 Yaniv Kaul 2016-03-20 14:57:34 UTC
(In reply to ratamir from comment #3)
> Created attachment 1138253 [details]
> engine and vdsm logs

Please compress (.gz or friends) logs. A ~18MB log is slow to download over WAN (remotees).

Comment 5 Yaniv Kaul 2016-03-20 14:59:56 UTC
(In reply to ratamir from comment #2)
> - Attaching new vdsm and engine log
> - I'm not sure if its a regression or not

That's an important piece of information.

> - Yes the size is growing:
> vdsClient reports that the size of the destination disk before the dd
> command is: truesize = 1073741824 (1GB) and capacity = 10737418240 (10GB).
> 
> The size of the source is (according to vdsClient):
> truesize = 4160749568 (4GB)
> capacity = 10737418240 (10GB)
> 
> After the dd command the destination disk size is 11GB:
> truesize = 11811160064
> capacity = 10737418240
> 
> - I'm expecting it to be 4GB like the source disk - This works just fine
> using nfs, iscsi and gluster.

But it was extended, right? How many times was it extended? (Please look in the VDSM logs and find out.) By how much in each extension?

> 
> - The metadata should be 128MB (extent size).
> 
> Important info that I see from the engine logs is that the vm entering into
> pause state during the operation:
> 
> 2016-03-20 13:48:50,768 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ForkJoinPool-1-worker-88) [] Correlation ID: null, Call Stack: null, Custom
> Event ID: -1, Message: VM test has been paused due to no Storage space error.

Makes sense. FC is faster - and you might be working against slow storage (or some other bottleneck).

Comment 6 Raz Tamir 2016-03-20 15:22:02 UTC
This is new to our automation but the manual guys said it is a regression.

It was extended 10 times, by 1024MB each time:

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 2048

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 3072

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 4096

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 5120

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 6144

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 7168

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 8192

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 9216

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 10240

- extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 11264
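The pattern above is chunked thin-provisioned extension: the volume grows in fixed 1024MB steps as the guest writes past the currently allocated size. A minimal sketch of the step sequence (the 1024MB chunk and 11264MB cap come from the log above; the function name is illustrative, not vdsm's actual API):

```python
# Illustrative sketch of chunked volume extension; not vdsm's real code.
CHUNK_MB = 1024  # extension chunk size seen in the log above

def extension_steps(start_mb, cap_mb, chunk_mb=CHUNK_MB):
    """Yield each target size the volume is extended to, in MB."""
    size = start_mb
    while size < cap_mb:
        size = min(size + chunk_mb, cap_mb)
        yield size

# Destination starts at 1024MB (1GB); the cap is virtual size * 1.1 = 11264MB.
steps = list(extension_steps(1024, 11264))
print(len(steps), steps[0], steps[-1])  # 10 extensions: 2048 ... 11264
```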

Comment 7 Yaniv Kaul 2016-03-20 15:27:38 UTC
(In reply to ratamir from comment #6)
> This is new to our automation but the manual guys said it is a regression.
> 
> It was extended 10 times in 1024 each time:
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 2048
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 3072
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 4096
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 5120
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 6144
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 7168
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 8192
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 9216
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 10240
> 
> - extending volume 3bf94ddc-355d-490b-a880-aa34b285a464 in domain
> 26ad735d-4630-41bb-8374-ddf99e0e7090 (pool
> f5d256d9-b7b6-4009-8f98-d92285dcf549) to size 11264

So now you know why it's 11G ;-)

Good, now please see WHY the engine thought it needed to be extended 10 times.

Comment 8 Allon Mureinik 2016-03-21 10:09:33 UTC
Nir, as the QA contact, please take a look.
Thanks!

Comment 9 Nir Soffer 2016-03-21 10:59:04 UTC
There is no bug here:

    dd if=/dev/vdb of=/dev/vdc bs=1M oflag=direct

will copy the entire device (virtual size = 10G) into the other device. We extend
disks up to (virtual size * 1.1) to leave room for qcow2 overhead, so getting an
11G disk at the end is expected.
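The arithmetic checks out: assuming actual_size is reported in bytes, the 1.1 factor applied to a 10G virtual size lands exactly on the truesize reported in comment 2:

```python
# Verify the 11G figure: disks are extended up to virtual size * 1.1
# to leave room for qcow2 metadata overhead.
virtual_size = 10 * 2**30             # 10 GiB virtual size, in bytes
max_actual = virtual_size * 11 // 10  # virtual size * 1.1, in integer math

print(max_actual)  # 11811160064 -> the truesize reported after the dd
```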