Bug 1412455
| Summary: | [Bug] Gluster brick created with RHEV manager is overallocated | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Sachin Raje <sraje> |
| Component: | vdsm | Assignee: | Gobinda Das <godas> |
| Status: | CLOSED ERRATA | QA Contact: | Kevin Alon Goldblatt <kgoldbla> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.6.9 | CC: | bkunal, godas, lsurette, mpillai, nkshirsa, pstehlik, ratamir, sabose, sasundar, sraje, srevivo, tjelinek, ycui, ykaul |
| Target Milestone: | ovirt-4.2.0 | Keywords: | Reopened |
| Target Release: | 4.2.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | If this bug requires documentation, please select an appropriate Doc Type value. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-05-15 17:49:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Gluster | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Sachin Raje
2017-01-12 04:42:14 UTC
if the pv, the vg and the thinly provisioned lv are all the same size, where do we store the metadata? Don't we need to reserve at least some space?

  --- Physical volume ---
  PV Name               /dev/mapper/mpathb
  VG Name               vg-brick04
  PV Size               36.33 TiB / not usable 0
  Allocatable           yes
  PE Size               1.25 MiB
  Total PE              30478511
  Free PE               1
  Allocated PE          30478510
  PV UUID               CQdZxE-Lj3D-kuRD-zYma-cn8t-3s9P-hCo30s

  --- Volume group ---
  VG Name               vg-brick04
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  7
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               36.33 TiB
  PE Size               1.25 MiB
  Total PE              30478511
  Alloc PE / Size       30478510 / 36.33 TiB
  Free  PE / Size       1 / 1.25 MiB
  VG UUID               PKYtXv-Zk4Z-6b5a-k6Ym-JSO3-5uxW-NyIsZZ

  --- Logical volume ---
  LV Path                /dev/vg-brick04/brick04
  LV Name                brick04
  VG Name                vg-brick04
  LV UUID                2owAeG-qPIC-GFK7-l6yC-F4fF-XYAj-5JMMZl
  LV Write Access        read/write
  LV Creation host, time svg302.sst.rad.lan, 2016-08-11 10:36:52 +0200
  LV Pool name           pool-brick04
  LV Status              available
  # open                 1
  LV Size                36.33 TiB
  Mapped size            97.49%
  Current LE             30478511
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:27

  LV                   VG         Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert PE Ranges
  pool-brick04         vg-brick04 twi-aot--- 36.32t             97.53  5.53                            pool-brick04_tdata:0-30465402
  [pool-brick04_tdata] vg-brick04 Twi-ao---- 36.32t                                                    /dev/mapper/mpathb:0-30465402
  [pool-brick04_tmeta] vg-brick04 ewi-ao---- 16.00g                                                    /dev/mapper/mpathb:30465403-30478509

This is the reason the storage team suggested setting 'thin_pool_autoextend_threshold' to less than 100; the exact value should be decided by the admin or by the tool which creates it.

This bug is about managing gluster bricks in RHV, not about how RHV consumes them. Sahina - I'm assigning it to you for initial research. I guess the component should be changed too, but I'm not sure to what.

Ramesh - can you take a look?
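The over-allocation in the output above can be checked with simple extent arithmetic. The following is a minimal sketch (not vdsm code; the variable names are illustrative, the numbers are taken from the pvdisplay/vgdisplay/lvs output in comment #1), showing that a thin LV sized to the whole VG overcommits the pool by exactly the 16 GiB metadata LV plus the one free PE:

```python
# Extent arithmetic behind this report; numbers from the output above.
PE_SIZE_MIB = 1.25          # "PE Size 1.25 MiB"
TOTAL_PE = 30478511         # "Total PE 30478511"
TMETA_GIB = 16              # pool metadata LV: "[pool-brick04_tmeta] ... 16.00g"

tmeta_pe = int(TMETA_GIB * 1024 / PE_SIZE_MIB)   # extents used by _tmeta
pool_data_pe = TOTAL_PE - tmeta_pe - 1           # 1 PE is left free in the VG

# Before the fix: the thin LV's virtual size was the whole VG
# ("Current LE 30478511" for brick04).
thin_lv_pe_before = TOTAL_PE
overcommit_pe = thin_lv_pe_before - pool_data_pe

# After the fix: the thin LV's virtual size equals the pool data size,
# so there is no over-allocation (absent snapshots).
thin_lv_pe_after = pool_data_pe

print(tmeta_pe, pool_data_pe, overcommit_pe)
```

The computed pool data size (30465403 extents) matches the `pool-brick04_tdata:0-30465402` PE range reported by lvs, and the overcommit (13108 extents) is the 16 GiB of metadata plus the single free PE.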
(In reply to Sachin Raje from comment #1)
> if the pv, the vg and the thinly provisioned lv are all the same size, where
> do we store the metadata? Don't we need to reserve at least some space?
>
> [pvdisplay, vgdisplay, and lvdisplay output quoted from comment #1]
>
> This is the reason storage team suggested to set
> 'thin_pool_autoextend_threshold' less than 100 and the exact value should be
> decided by the admin or tool which creates it?

We are not modifying the RHEL default values of 'thin_pool_autoextend_threshold'.
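For reference, the setting discussed above lives in the `activation` section of /etc/lvm/lvm.conf. A hedged sketch of what an admin-chosen value could look like (the threshold value 80 is an example, not a recommendation from this bug; as noted above, the fix leaves the RHEL defaults untouched):

```
activation {
    # 100 (the default) disables automatic extension; a value below 100,
    # e.g. 80, extends the thin pool once it is 80% full.
    thin_pool_autoextend_threshold = 80
    # Grow the pool by 20% of its current size each time the
    # threshold is crossed.
    thin_pool_autoextend_percent = 20
}
```

Autoextension also requires the lvm2 monitoring service (dmeventd) to be active on the host.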
But in Gluster brick creation, 16GB is reserved for metadata. So in this case, the lv is overallocated by 16GB. We should reduce this from the thin lv.

But how will it work with lvm snapshots? Is there a possibility of facing the same problem while using lvm snapshots?

This has to be fixed in vdsm-gluster, so I am changing the component accordingly. I will post a patch to match the size of the lv with the size of the thinpool. This guarantees that there is no over-allocation as long as there is no LVM snapshot/Gluster volume snapshot.

Before the fix, you can see that Current LE for the LV is higher than Current LE for the thinpool:

  --- Logical volume ---
  LV Name                pool-brick1
  VG Name                vg-brick1
  LV UUID                Ynk3ou-kAnd-MuZe-d1Qn-owPQ-DHT5-fKwULZ
  LV Write Access        read/write
  LV Creation host, time headwig.lab.eng.blr.redhat.com, 2017-02-20 17:06:12 +0530
  LV Pool metadata       pool-brick1_tmeta
  LV Pool data           pool-brick1_tdata
  LV Status              available
  # open                 2
  LV Size                1.80 TiB
  Allocated pool data    0.05%
  Allocated metadata     0.01%
  Current LE             1512651
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:22

  --- Logical volume ---
  LV Path                /dev/vg-brick1/brick1
  LV Name                brick1
  VG Name                vg-brick1
  LV UUID                mH035s-CciV-By4G-RIIi-sHP0-oxaX-ZnFLoF
  LV Write Access        read/write
  LV Creation host, time headwig.lab.eng.blr.redhat.com, 2017-02-20 17:06:20 +0530
  LV Pool name           pool-brick1
  LV Status              available
  # open                 1
  LV Size                1.82 TiB
  Mapped size            0.05%
  Current LE             1525759
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:24

After the fix, Current LE in the thinpool and the lv match exactly:
  --- Logical volume ---
  LV Name                pool-brick2
  VG Name                vg-brick2
  LV UUID                nDngM3-tL4q-3Xaf-Azqn-ljHR-psZf-u8FQAr
  LV Write Access        read/write
  LV Creation host, time headwig.lab.eng.blr.redhat.com, 2017-02-20 18:14:41 +0530
  LV Pool metadata       pool-brick2_tmeta
  LV Pool data           pool-brick2_tdata
  LV Status              available
  # open                 2
  LV Size                1.80 TiB
  Allocated pool data    0.05%
  Allocated metadata     0.01%
  Current LE             1512651
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:27

  --- Logical volume ---
  LV Path                /dev/vg-brick2/brick2
  LV Name                brick2
  VG Name                vg-brick2
  LV UUID                SQgqXI-n564-6AkX-aJlR-KpM5-ajYD-IqDdRc
  LV Write Access        read/write
  LV Creation host, time headwig.lab.eng.blr.redhat.com, 2017-02-20 18:14:50 +0530
  LV Pool name           pool-brick2
  LV Status              available
  # open                 1
  LV Size                1.80 TiB
  Mapped size            0.05%
  Current LE             1512651
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:29

Ramesh - The patch attached to this BZ has been merged for several months. If this is indeed a z-stream bug, it needs to be cloned and the patch needs to be backported. If it isn't, please retarget it to 4.2.0.

Moving to Verified with code:
--------------------------
vdsm-4.20.9-1.git8d0bd46.el7.centos.x86_64

Verified with scenario:
--------------------------
1. I attached a 110GB disk to my host.
2. I created a brick via the host's Storage Devices > Create Brick.
3. The result is that the Logical Extents for the LV and the thin pool are the same.

Moving to VERIFIED.

  --- Logical volume ---
  LV Name                pool-brick2
  VG Name                vg-brick2
  LV UUID                nLGNxf-zl7d-Obh5-Vqbd-ixCb-z4hL-0GhTDw
  LV Write Access        read/write
  LV Creation host, time vm-83-162.scl.lab.tlv.redhat.com, 2017-12-05 14:29:46 +0200
  LV Pool metadata       pool-brick2_tmeta
  LV Pool data           pool-brick2_tdata
  LV Status              available
  # open                 2
  LV Size                <99.50 GiB
  Allocated pool data    0.05%
  Allocated metadata     0.03%   ***** IS THIS CORRECT AS EXPECTED ??? *****
  Current LE             407548
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:3

  --- Logical volume ---
  LV Path                /dev/vg-brick2/brick2
  LV Name                brick2
  VG Name                vg-brick2
  LV UUID                lG5xm2-c6el-NOGc-Jgv5-SCYF-CYtS-HoRRnv
  LV Write Access        read/write
  LV Creation host, time vm-83-162.scl.lab.tlv.redhat.com, 2017-12-05 14:29:52 +0200
  LV Pool name           pool-brick2
  LV Status              available
  # open                 1
  LV Size                <99.50 GiB
  Mapped size            0.05%
  Current LE             407548
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:5

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1489
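The verification step above boils down to comparing the Current LE of the thin LV against that of its pool. A small sketch of that check (this parser and its names are illustrative, not part of vdsm; the sample text is abridged from the pre-fix output in comment #3):

```python
# Flag thin LVs whose Current LE exceeds their pool's Current LE,
# i.e. the over-allocation this bug is about.
import re

def current_le(lvdisplay_text):
    """Map LV name -> Current LE from lvdisplay-style output."""
    result = {}
    name = None
    for line in lvdisplay_text.splitlines():
        m = re.search(r"LV Name\s+(\S+)", line)
        if m:
            name = m.group(1)
        m = re.search(r"Current LE\s+(\d+)", line)
        if m and name:
            result[name] = int(m.group(1))
    return result

# Abridged pre-fix output from comment #3.
sample = """
  LV Name                pool-brick1
  Current LE             1512651
  LV Name                brick1
  Current LE             1525759
"""
le = current_le(sample)
overallocated = le["brick1"] > le["pool-brick1"]
print(overallocated)
```

With the post-fix numbers (1512651 for both brick2 and pool-brick2), the same comparison comes out False, which is what the verification scenario confirms.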