Bug 1459233
| Summary: | volume creation fails with insufficient free space error although there is enough free space available | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | krishnaram Karthick <kramdoss> |
| Component: | heketi | Assignee: | Michael Adam <madam> |
| Status: | CLOSED ERRATA | QA Contact: | vinutha <vinug> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | cns-3.6 | CC: | akrishna, asriram, hchiramm, jmulligan, kramdoss, madam, pprakash, rhs-bugs, rreddy, rtalur, sankarshan, sarumuga, storage-qa-internal |
| Target Milestone: | --- | ||
| Target Release: | CNS 3.10 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | heketi-7.0.0-6.el7rhgs | Doc Type: | Bug Fix |
| Doc Text: | Previously, Heketi would attempt creation of bricks on devices with insufficient space because of an inconsistency in the free space accounting on the device in Heketi's database. This fix corrects the accounting logic during error handling to prevent the inconsistency. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-09-12 09:22:12 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1444749 | ||
| Bug Blocks: | 1568861 | ||
|
Description
krishnaram Karthick
2017-06-06 15:05:11 UTC
This seems to be a duplicate of bug #1444749.

Because it involves deletes... scale testing would not necessarily involve a mix of creates and deletes.

This issue is seen even without any volumes being deleted. A new set of logs will be attached shortly. Moving the bug back.

[root@dhcp46-68 ~]# heketi-cli volume create --size=1
Error: Unable to execute command on glusterfs-fgsk5: Volume group "vg_ad555009def9baae0ba65b24a5afbdc8" has insufficient free space (23 extents): 256 required.

[root@dhcp46-68 ~]# heketi-cli node info 40750252f1e3afbfea9a00f04b65a7a6
Node Id: 40750252f1e3afbfea9a00f04b65a7a6
State: online
Cluster Id: d86f9226ca8660768aac165bc153ca0f
Zone: 1
Management Hostname: dhcp46-26.lab.eng.blr.redhat.com
Storage Hostname: 10.70.46.26
Devices:
Id:09e88fe8329f128a47ffdc1ec3873233 Name:/dev/sdi State:online Size (GiB):99 Used (GiB):43 Free (GiB):56
Id:5cf9e19d3644148deb37be38e1593321 Name:/dev/sde State:online Size (GiB):99 Used (GiB):42 Free (GiB):57
Id:79849730260b3da1a84021138598e760 Name:/dev/sdj State:online Size (GiB):99 Used (GiB):44 Free (GiB):55
Id:cd2fdfb6cfaf2c068dc4e9765914254d Name:/dev/sdd State:online Size (GiB):99 Used (GiB):45 Free (GiB):54
Id:e7bdd15d708892f31aa7952ed7e26d29 Name:/dev/sdg State:online Size (GiB):99 Used (GiB):97 Free (GiB):2
Id:f4eb569019bcb482edaa3bd8e8a2ef75 Name:/dev/sdf State:online Size (GiB):99 Used (GiB):99 Free (GiB):0
Id:fcb5ee83a2b6e94c4bcf208c3a356495 Name:/dev/sdh State:online Size (GiB):99 Used (GiB):99 Free (GiB):0

[root@dhcp46-68 ~]# heketi-cli node info 8011c70e910300fdcc47371a20988289
Node Id: 8011c70e910300fdcc47371a20988289
State: online
Cluster Id: d86f9226ca8660768aac165bc153ca0f
Zone: 2
Management Hostname: dhcp46-214.lab.eng.blr.redhat.com
Storage Hostname: 10.70.46.214
Devices:
Id:0f3a2894165737740f6905c025591951 Name:/dev/sde State:online Size (GiB):99 Used (GiB):52 Free (GiB):47
Id:b9f3b771b2af55a13809122d42fb1552 Name:/dev/sdi State:online Size (GiB):99 Used (GiB):29 Free (GiB):70
Id:c5f31a5d2e8f1acd9c633856ea228b67 Name:/dev/sdh State:online Size (GiB):99 Used (GiB):44 Free (GiB):55
Id:cbec32659243012771d34a328c96a04f Name:/dev/sdg State:online Size (GiB):99 Used (GiB):47 Free (GiB):52
Id:d051c6967ac778fcf123a6b6ff4b69e6 Name:/dev/sdj State:online Size (GiB):99 Used (GiB):99 Free (GiB):0
Id:e5e3abba7ae56937010e55b603f61cd6 Name:/dev/sdd State:online Size (GiB):99 Used (GiB):99 Free (GiB):0
Id:eb8d5f6181832f38f7237a19d0169ced Name:/dev/sdf State:online Size (GiB):99 Used (GiB):99 Free (GiB):0

[root@dhcp46-68 ~]# heketi-cli node info df597bba83cd6b9d0fa3f58693f3207c
Node Id: df597bba83cd6b9d0fa3f58693f3207c
State: online
Cluster Id: d86f9226ca8660768aac165bc153ca0f
Zone: 1
Management Hostname: dhcp47-39.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.39
Devices:
Id:135eccfad8fff5b8d22df0bf9fa62435 Name:/dev/sdi State:online Size (GiB):99 Used (GiB):16 Free (GiB):83
Id:40add89337d6585c70149d8b5754e689 Name:/dev/sdd State:online Size (GiB):99 Used (GiB):26 Free (GiB):73
Id:a13ceae201fb7d48719f5941a6e2e15c Name:/dev/sdf State:online Size (GiB):99 Used (GiB):32 Free (GiB):67
Id:ad555009def9baae0ba65b24a5afbdc8 Name:/dev/sde State:online Size (GiB):99 Used (GiB):98 Free (GiB):1
Id:cc67894da6329bab6cd8c1962f6238ae Name:/dev/sdg State:online Size (GiB):99 Used (GiB):99 Free (GiB):0
Id:de5f8d3d74e708321200d30be876f799 Name:/dev/sdh State:online Size (GiB):99 Used (GiB):99 Free (GiB):0
Id:f909be351a83e9fc7db36d15354f1d7f Name:/dev/sdj State:online Size (GiB):99 Used (GiB):99 Free (GiB):0

One thing to note here: the VG in question only has:

Id:ad555009def9baae0ba65b24a5afbdc8 Name:/dev/sde State:online Size (GiB):99 Used (GiB):98 Free (GiB):1

(In reply to Humble Chirammal from comment #9)
> One thing to note here: the VG in question only has:
>
> Id:ad555009def9baae0ba65b24a5afbdc8 Name:/dev/sde State:online
> Size (GiB):99 Used (GiB):98 Free (GiB):1

How many volumes exist in this cluster?
It would also be appreciated if you could attach the LVM outputs (pvs, vgs, lvs, etc.).

I agree to move this out of the CNS release, considering the complexity of the fix and the fact that this bug has been present since day 0.

The cause of this bug is a database inconsistency between Heketi's record of free space and the actual free space in the LVM volume group. Heketi believed a device had more free space than it actually had. With the current builds, this inconsistency in per-device free space should no longer occur.

However, a test case is somewhat difficult to construct, because the inconsistency was introduced only when:
1. gluster did not respond after volume creation
2. heketi could not perform kubeexec after volume creation

Possible way to hit this bug: while creating and deleting PVCs in a loop, reboot the OCP master node.

I will add a more deterministic test case if I find one. Please start with the one suggested above.

Thank you John, I have updated based on the feedback given.

Doc Text looks OK

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2686 |
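
The mismatch described in this report can be illustrated with a small sketch. It is not part of Heketi or of the fix. It only parses device lines in the format shown in the `heketi-cli node info` output above and compares Heketi's free-space figure with a figure taken from LVM. The function names and the `lvm_free_mib` mapping are hypothetical, the 4 MiB extent size is LVM's default and is assumed (not confirmed) for this volume group, and how the LVM figures are collected (for example from `vgs` on the storage node) is left out.

```python
import re

# Assumption: the volume group uses LVM's default 4 MiB physical extent size.
EXTENT_MIB = 4

# Matches one device entry as printed in the node info output quoted in this bug.
DEVICE_RE = re.compile(
    r"Id:(?P<id>[0-9a-f]+)\s+Name:(?P<name>\S+)\s+State:(?P<state>\S+)\s+"
    r"Size \(GiB\):(?P<size>\d+)\s+Used \(GiB\):(?P<used>\d+)\s+"
    r"Free \(GiB\):(?P<free>\d+)"
)

# One device line copied from the report.
SAMPLE = (
    "Id:ad555009def9baae0ba65b24a5afbdc8 Name:/dev/sde State:online "
    "Size (GiB):99 Used (GiB):98 Free (GiB):1"
)


def parse_devices(node_info_text):
    """Return {device_id: free_gib} as Heketi reports it."""
    return {
        m.group("id"): int(m.group("free"))
        for m in DEVICE_RE.finditer(node_info_text)
    }


def extents_to_mib(extents, extent_mib=EXTENT_MIB):
    """Convert an LVM extent count to MiB."""
    return extents * extent_mib


def find_overcounted(heketi_free_gib, lvm_free_mib):
    """List devices where Heketi's free space exceeds what LVM reports.

    lvm_free_mib is a hypothetical mapping {vg_name: free_mib}; the VG name
    is "vg_" + device id, as seen in the error message in this report.
    """
    mismatches = []
    for dev_id, free_gib in heketi_free_gib.items():
        actual_mib = lvm_free_mib.get("vg_" + dev_id)
        if actual_mib is not None and free_gib * 1024 > actual_mib:
            mismatches.append((dev_id, free_gib * 1024, actual_mib))
    return mismatches


if __name__ == "__main__":
    # The error says 23 extents are free while 256 are required.
    lvm = {"vg_ad555009def9baae0ba65b24a5afbdc8": extents_to_mib(23)}
    for dev_id, heketi_mib, actual_mib in find_overcounted(parse_devices(SAMPLE), lvm):
        print(f"{dev_id}: heketi thinks {heketi_mib} MiB free, LVM has {actual_mib} MiB")
```

Under the 4 MiB assumption, the 256 required extents correspond to the 1 GiB brick being requested, while the 23 free extents amount to only 92 MiB, even though Heketi lists 1 GiB free on that device.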