Bug 1466253 - RHV-M reports wrong storage consumption on GlusterFS store
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-dashboard
Version: 4.1.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ovirt-4.1.6
Target Release: ---
Assigned To: Gobinda Das
QA Contact: Pavel Stehlik
Depends On:
Blocks:
 
Reported: 2017-06-29 07:06 EDT by Daniel Messer
Modified: 2017-08-23 02:33 EDT
CC: 6 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-23 02:33:18 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Gluster
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
screenshot for storage gauge (168.05 KB, image/png)
2017-08-10 05:25 EDT, RamaKasturi
All the vms in the system (170.88 KB, image/png)
2017-08-10 05:25 EDT, RamaKasturi
dashboard_storage (156.66 KB, image/png)
2017-08-10 05:28 EDT, RamaKasturi

Description Daniel Messer 2017-06-29 07:06:12 EDT
Description of problem:
The aggregate storage consumption reported by RHV-M does not add up to what's really stored. When tracked down to the individual disks, it turns out that for thick-provisioned disks which are, e.g., 5 GiB provisioned, the reported actual capacity is 37-38 GiB.
The dashboard page, on the other hand, reports only the real consumed capacity on the data store.

In my case:

8 VMs, each with a 10 GiB thin-provisioned disk and a 5 GiB thick-provisioned disk. Combined data store usage as per the RHV-M dashboard = 42 GiB. So far, so good. When looking at the detailed values by clicking the storage "gauge" on the dashboard, I can see the consumers, which are VMs reporting between 39 and 42 GiB of utilization alone. Looking at the individual disks, it becomes clear the reason is that the 5 GiB thick-provisioned disks are reported as 37-38 GiB each.
The GlusterFS mount is 3-way replicated. Even with replication taken into account, the values don't add up. On the Gluster mount, the capacities of the individual files are reported correctly, but the total directory capacity is wrong:

ll -ah d4c1b3a2-9074-46ea-aeba-907981c10652

total 35G
drwxr-xr-x.  2 vdsm kvm 4.0K Jun 29 00:43 .
drwxr-xr-x. 20 vdsm kvm 4.0K Jun 29 10:24 ..
-rw-rw----.  1 vdsm kvm 5.0G Jun 29 10:21 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d
-rw-rw----.  1 vdsm kvm 1.0M Jun 29 00:32 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.lease
-rw-r--r--.  1 vdsm kvm  271 Jun 29 00:43 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.meta

On the bricks, on the other hand, the reported size is way too small:

ll -ah /gluster_bricks/vmstore/vmstore/b9c28b59-52bb-442a-a3a9-81f4035ee97a/images/d4c1b3a2-9074-46ea-aeba-907981c10652/
total 4.9M
drwxr-xr-x.  2 vdsm kvm  149 Jun 29 00:43 .
drwxr-xr-x. 20 vdsm kvm 8.0K Jun 29 10:24 ..
-rw-rw----.  2 vdsm kvm 4.0M Jun 29 10:21 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d
-rw-rw----.  2 vdsm kvm 1.0M Jun 29 00:32 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.lease
-rw-r--r--.  2 vdsm kvm  271 Jun 29 00:43 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.meta

Note that these disks were created during a "Clone VM" operation.

Version-Release number of selected component (if applicable):

RHHI with RHV-M 4.1.2

How reproducible:

Always.
Comment 1 Sahina Bose 2017-07-14 05:14:31 EDT
(In reply to Daniel Messer from comment #0)
> Description of problem:
> The aggregate storage consumption reported by RHV-M does not add up to
> what's really stored. When tracked down to the individual disks it turns out
> that for thick-provisioned disks which are e.g. 5GiB provisioned, the
> reported actual capacity is 37-38 GiB.
> On the other hand the dashboard page only reports the real consumed capacity
> on the datastore.
> 
> In my case:
> 
> 8 VMs, each with a 10GiB thin-provisioned disk and a 5 GiB thick-provisioned
> disk. Combined data store usage as per RHV-M dashboard = 42 GiB. So far so
> correct. When looking at the detailed values by clicking the storage "gauge"
> on the dashboard I can see the consumers which are VMs that have between 39
> and 42 GiB reported utilization alone. When looking at the individual disks
> it becomes clear the reason is that the 5 GiB thick provisioned disk are
> reported as 37-38 GiB each.
> The GlusterFS mount is 3-way replicated. Even with replication taken into
> account the values don't add up. When looking at the Gluster mount the
> capacities of the files are reported correctly for the files but the total
> directory capacity is wrong:
> 
> ll -ah d4c1b3a2-9074-46ea-aeba-907981c10652
> 
> total 35G
> drwxr-xr-x.  2 vdsm kvm 4.0K Jun 29 00:43 .
> drwxr-xr-x. 20 vdsm kvm 4.0K Jun 29 10:24 ..
> -rw-rw----.  1 vdsm kvm 5.0G Jun 29 10:21
> 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d
> -rw-rw----.  1 vdsm kvm 1.0M Jun 29 00:32
> 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.lease
> -rw-r--r--.  1 vdsm kvm  271 Jun 29 00:43
> 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.meta


Krutika, is there something related to shard in displaying size?

> 
> On the bricks the reported size on the other hand is way to small:
> 
> ll -ah
> /gluster_bricks/vmstore/vmstore/b9c28b59-52bb-442a-a3a9-81f4035ee97a/images/
> d4c1b3a2-9074-46ea-aeba-907981c10652/
> total 4.9M
> drwxr-xr-x.  2 vdsm kvm  149 Jun 29 00:43 .
> drwxr-xr-x. 20 vdsm kvm 8.0K Jun 29 10:24 ..
> -rw-rw----.  2 vdsm kvm 4.0M Jun 29 10:21
> 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d
> -rw-rw----.  2 vdsm kvm 1.0M Jun 29 00:32
> 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.lease
> -rw-r--r--.  2 vdsm kvm  271 Jun 29 00:43
> 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.meta

This is expected, as only the first shard is created in this directory. All other shards are under the brick's hidden .shard directory.
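To illustrate the point above: only the first shard of an image sits next to its .lease and .meta files on the brick; the remaining pieces are GFID-named files under the brick's .shard/ directory, so an `ll` of the images directory alone undercounts. A minimal sketch (the layout is as described here; the function name, paths, and gfid argument are hypothetical; the base file's gfid can be read with `getfattr -n trusted.gfid -e hex <file-path-on-brick>`):

```python
import glob
import os

def sharded_size(brick, base_rel_path, gfid):
    """Total apparent bytes of a sharded file on one brick: the base shard
    in images/ plus its <gfid>.N pieces under the hidden .shard/ directory."""
    total = os.path.getsize(os.path.join(brick, base_rel_path))
    for piece in glob.glob(os.path.join(brick, ".shard", gfid + ".*")):
        total += os.path.getsize(piece)
    return total
```

This sums apparent sizes only; the replica count of the volume still multiplies the physical usage.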

Comment 4 Krutika Dhananjay 2017-07-17 02:47:43 EDT
This does look very similar to https://bugzilla.redhat.com/show_bug.cgi?id=1332861

Shard relies on the size and block count returned by the posix translator (which talks directly to the disks) for its aggregated size accounting. With XFS speculative preallocation, the number of blocks allocated in anticipation of further writes might initially be more than what is required, and it is this block count - preallocation included - that shard would have used for its accounting. But XFS, unbeknownst to shard, releases these blocks at a later point. I'm not sure whether this difference can become this wide over time. Let me test that out and get back, in case there are other bugs/problems contributing to it.

-Krutika
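The explanation above hinges on the gap between a file's apparent size and its allocated block count. A minimal illustration of that gap (using a sparse file rather than XFS preallocation, but exercising the same st_blocks metric that, per the comment above, shard's accounting relies on):

```python
import os
import tempfile

# Apparent size vs allocated blocks: the two routinely disagree.
# XFS speculative preallocation inflates the block count and later
# shrinks it again, without the apparent size ever changing.
fd, path = tempfile.mkstemp()
os.close(fd)
os.truncate(path, 1 << 30)       # 1 GiB apparent size, nothing written
st = os.stat(path)
apparent = st.st_size            # what ls -l reports
allocated = st.st_blocks * 512   # what du and block-count accounting see
print(apparent, allocated)
os.remove(path)
```

Here the allocated bytes are far below the apparent size; under preallocation the mismatch runs in the opposite direction, which matches the inflated 37-38 GiB figures in the Description.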
Comment 5 Krutika Dhananjay 2017-07-17 03:26:35 EDT
Hi Daniel,

Could you provide the xattrs of the example files you mentioned in the Description of this bug from the brick?

# getfattr -d -m . -e hex <file-path-wrt-brick>


Specifically I'm looking for xattrs of the original image, .meta and .lease files listed below:


ll -ah /gluster_bricks/vmstore/vmstore/b9c28b59-52bb-442a-a3a9-81f4035ee97a/images/d4c1b3a2-9074-46ea-aeba-907981c10652/
total 4.9M
drwxr-xr-x.  2 vdsm kvm  149 Jun 29 00:43 .
drwxr-xr-x. 20 vdsm kvm 8.0K Jun 29 10:24 ..
-rw-rw----.  2 vdsm kvm 4.0M Jun 29 10:21 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d
-rw-rw----.  2 vdsm kvm 1.0M Jun 29 00:32 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.lease
-rw-r--r--.  2 vdsm kvm  271 Jun 29 00:43 3517fa85-42f9-4c0c-89fc-47b82d2e3a9d.meta
Comment 6 Sahina Bose 2017-07-31 03:25:11 EDT
Kasturi, can you check if you can reproduce this?
Comment 8 RamaKasturi 2017-08-10 05:24:41 EDT
Hi Sahina,

    As far as I understood the bug description, I tried to reproduce the scenario, and below is my analysis.

1) Created a VM with a 10 GiB thin-provisioned disk and installed the OS.
2) Once the VM came up, attached another disk of 5 GiB, preallocated / thick-provisioned.
3) Powered off the VM, then used the Clone VM button and cloned another 7 VMs.
4) Powered all the machines on.
5) Now when I click on the storage gauge in the dashboard, I see all the VMs present in the system listed with their current usage. Attached a screenshot of the same.
6) Created 8 VMs; in total there are 9 VMs available in the system, including the Hosted Engine, and all of them show up in the storage gauge.

One issue I observed is that the total usage of the data volume is 16.8 GiB, whereas the storage gauge reports it as only 14.0 GiB.

@sahina, please let me know in case I have missed something here.

Thanks,
Kasturi
Comment 9 RamaKasturi 2017-08-10 05:25 EDT
Created attachment 1311639 [details]
screenshot for storage gauge
Comment 10 RamaKasturi 2017-08-10 05:25 EDT
Created attachment 1311640 [details]
All the vms in the system
Comment 11 RamaKasturi 2017-08-10 05:28:13 EDT
Since my disks can accommodate 12.8 terabytes of space and I have used very little of that (~19 GB), the dashboard does not display much usage.

Attaching a screenshot of the same.
Comment 12 RamaKasturi 2017-08-10 05:28 EDT
Created attachment 1311642 [details]
dashboard_storage
Comment 13 RamaKasturi 2017-08-10 07:47:24 EDT
From the mount point:
==============================================
[root@rhsqa-grafton1 ~]# ll -ah /rhev/data-center/mnt/glusterSD/10.70.36.79\:_data/9a86cceb-f5fa-42ce-a457-ff594bc80263/images/c3858a8e-9ab3-4bf0-9867-b1e560f899d5/
total 5.1G
drwxr-xr-x.  2 vdsm kvm 4.0K Aug  9 11:39 .
drwxr-xr-x. 20 vdsm kvm 4.0K Aug 10 14:39 ..
-rw-rw----.  1 vdsm kvm 5.0G Aug  9 11:38 bb8ba404-5ced-4248-87b6-0052f196e00b
-rw-rw----.  1 vdsm kvm 1.0M Aug  9 11:39 bb8ba404-5ced-4248-87b6-0052f196e00b.lease
-rw-r--r--.  1 vdsm kvm  325 Aug  9 11:39 bb8ba404-5ced-4248-87b6-0052f196e00b.meta

From the brick backend: (shard size here is 64M)
================================================

[root@rhsqa-grafton1 ~]# ll -ah /gluster_bricks/data/data/9a86cceb-f5fa-42ce-a457-ff594bc80263/images/c3858a8e-9ab3-4bf0-9867-b1e560f899d5/
total 66M
drwxr-xr-x.  2 vdsm kvm  165 Aug  9 11:39 .
drwxr-xr-x. 20 vdsm kvm 8.0K Aug 10 14:39 ..
-rw-rw----.  2 vdsm kvm  64M Aug  9 11:38 bb8ba404-5ced-4248-87b6-0052f196e00b
-rw-rw----.  2 vdsm kvm 1.0M Aug  9 11:39 bb8ba404-5ced-4248-87b6-0052f196e00b.lease
-rw-r--r--.  2 vdsm kvm  325 Aug  9 11:39 bb8ba404-5ced-4248-87b6-0052f196e00b.meta
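A back-of-the-envelope check on the listings above, assuming the 64 MiB shard size stated in this comment: a 5 GiB preallocated image splits into 80 shards, only the first of which lives next to the .lease and .meta files, which is consistent with the 66M brick-side total versus 5.1G on the mount.

```python
GiB, MiB = 2 ** 30, 2 ** 20
image_size, shard_size = 5 * GiB, 64 * MiB

print(image_size // shard_size)          # shards in total -> 80
print((image_size - shard_size) // MiB)  # MiB living under .shard/ -> 5056
```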
Comment 14 Sahina Bose 2017-08-23 02:33:18 EDT
Closing this as per Comment 13, as we could not reproduce the issue. Please re-open if you face this issue again and can provide steps to recreate it.
