Bug 1459233 - volume creation fails with insufficient free space error although there is enough free space available
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: heketi
Version: cns-3.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: CNS 3.10
Assignee: Michael Adam
QA Contact: vinutha
URL:
Whiteboard:
Depends On: 1444749
Blocks: 1568861
 
Reported: 2017-06-06 15:05 UTC by krishnaram Karthick
Modified: 2019-09-02 17:27 UTC (History)

Fixed In Version: heketi-7.0.0-6.el7rhgs
Doc Type: Bug Fix
Doc Text:
Previously, Heketi would attempt creation of bricks on devices with insufficient space because of an inconsistency in the free space accounting on the device in Heketi's database. This fix corrects the accounting logic during error handling to prevent the inconsistency.
Clone Of:
Environment:
Last Closed: 2018-09-12 09:22:12 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2018:2686 0 None None None 2018-09-12 09:23:25 UTC

Description krishnaram Karthick 2017-06-06 15:05:11 UTC
Description of problem:

# heketi-cli volume create --size=1

[heketi] ERROR 2017/06/06 14:42:39 /src/github.com/heketi/heketi/apps/glusterfs/app_volume.go:148: Failed to create volume: Unable to execute command on glusterfs-5zsnm:   Volume group "vg_30243923179afa26036a3444176714d7" has insufficient free space (136 extents): 256 required.

heketi-cli node info 1e41507e2b982539e1d971eed46d4b70
Node Id: 1e41507e2b982539e1d971eed46d4b70
State: online
Cluster Id: b44591150662482c6b609e653aea0f62
Zone: 2
Management Hostname: dhcp46-190.lab.eng.blr.redhat.com
Storage Hostname: 10.70.46.190
Devices:
Id:af1808bcdd09a469323d9419aed03ec2   Name:/dev/sdg            State:online    Size (GiB):49      Used (GiB):19      Free (GiB):30      
Id:b69ae62d222ba91b98dfed6a7c501d16   Name:/dev/sdi            State:online    Size (GiB):499     Used (GiB):99      Free (GiB):400     
Id:bab673f21ec0e223270c0854fe3fed4b   Name:/dev/sdh            State:online    Size (GiB):49      Used (GiB):41      Free (GiB):8       
[root@dhcp46-161 yum.repos.d]# heketi-cli node info 4075d83ee08abed9b6940b95acbf4bcf
Node Id: 4075d83ee08abed9b6940b95acbf4bcf
State: online
Cluster Id: b44591150662482c6b609e653aea0f62
Zone: 1
Management Hostname: dhcp47-59.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.59
Devices:
Id:30243923179afa26036a3444176714d7   Name:/dev/sdd            State:online    Size (GiB):49      Used (GiB):15      Free (GiB):34      
Id:6beeb0f1a57ded472a19a97a8ae44b0b   Name:/dev/sde            State:online    Size (GiB):49      Used (GiB):24      Free (GiB):25      
Id:ec24740ad68a195c97994f2a7e3221f6   Name:/dev/sdf            State:online    Size (GiB):499     Used (GiB):120     Free (GiB):378     
[root@dhcp46-161 yum.repos.d]# heketi-cli node info aa2335269c612d157a3b665e39f03c2d
Node Id: aa2335269c612d157a3b665e39f03c2d
State: online
Cluster Id: b44591150662482c6b609e653aea0f62
Zone: 1
Management Hostname: dhcp46-65.lab.eng.blr.redhat.com
Storage Hostname: 10.70.46.65
Devices:
Id:290d310177a56866ee15bbf028e39f69   Name:/dev/sdd            State:online    Size (GiB):49      Used (GiB):19      Free (GiB):30      
Id:77d2351bbf29031850e06b4ec0a8a14a   Name:/dev/sde            State:online    Size (GiB):49      Used (GiB):25      Free (GiB):24      
Id:fbc5c9ece5e6046a01295722afe23753   Name:/dev/sdf            State:online    Size (GiB):499     Used (GiB):115     Free (GiB):383     
[root@dhcp46-161 yum.repos.d]# 
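
The size of the mismatch can be worked out from the log line and the node info above. The following is a minimal sketch, not part of heketi: it assumes the volume group uses LVM's default physical extent size of 4 MiB, and the names `EXTENT_MIB`, `LVM_ERROR_RE` and `parse_shortfall` are made up for illustration.

```python
import re

# Assumption: the volume group uses LVM's default physical extent size (4 MiB).
EXTENT_MIB = 4

LVM_ERROR_RE = re.compile(
    r'Volume group "(?P<vg>[^"]+)" has insufficient free space '
    r"\((?P<free>\d+) extents\): (?P<required>\d+) required"
)


def parse_shortfall(message):
    """Extract (vg_name, free_extents, required_extents) from the LVM error."""
    match = LVM_ERROR_RE.search(message)
    if match is None:
        raise ValueError("no LVM insufficient-free-space error found")
    return match.group("vg"), int(match.group("free")), int(match.group("required"))


# Error text copied from the heketi log above.
message = (
    'Volume group "vg_30243923179afa26036a3444176714d7" has insufficient '
    "free space (136 extents): 256 required."
)
vg_name, free_extents, required_extents = parse_shortfall(message)

free_mib = free_extents * EXTENT_MIB          # space LVM can still allocate
required_mib = required_extents * EXTENT_MIB  # space the failed request needed
heketi_free_mib = 34 * 1024                   # "Free (GiB):34" reported for /dev/sdd

print(f"{vg_name}: LVM free {free_mib} MiB, required {required_mib} MiB, "
      f"heketi reports {heketi_free_mib} MiB free")
```

Under that assumption, the 256 required extents correspond to the 1 GiB requested with --size=1, while LVM has only 544 MiB left on a device for which heketi reports 34 GiB free.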


Version-Release number of selected component (if applicable):
rhgs3/rhgs-volmanager-rhel7:3.3.0-1
heketi-client-5.0.0-1.el7rhgs.x86_64
rhgs3/rhgs-server-rhel7:3.3.0-2

How reproducible:
1/1

Steps to Reproduce:
1. Create and delete a couple of volumes via heketi. Cleanup of the deleted volumes will fail due to bug 1444749.
2. Add sufficient disks and try to create 300 volumes.


Actual results:
Volume creation fails with an insufficient free space error.

Expected results:
Volume creation should succeed

Additional info:

Comment 3 Michael Adam 2017-06-13 15:30:47 UTC
This seems to be a duplicate of bug #1444749, because it involves deletes...

Comment 7 Michael Adam 2017-06-13 15:37:02 UTC
Scale testing would not necessarily involve a mix of creates and deletes.

Comment 8 krishnaram Karthick 2017-06-16 10:58:07 UTC
This issue is seen even without any volumes being deleted. A new set of logs will be attached shortly. Moving the bug back.

[root@dhcp46-68 ~]# heketi-cli volume create --size=1
Error: Unable to execute command on glusterfs-fgsk5:   Volume group "vg_ad555009def9baae0ba65b24a5afbdc8" has insufficient free space (23 extents): 256 required.

[root@dhcp46-68 ~]# heketi-cli node info 40750252f1e3afbfea9a00f04b65a7a6
Node Id: 40750252f1e3afbfea9a00f04b65a7a6
State: online
Cluster Id: d86f9226ca8660768aac165bc153ca0f
Zone: 1
Management Hostname: dhcp46-26.lab.eng.blr.redhat.com
Storage Hostname: 10.70.46.26
Devices:
Id:09e88fe8329f128a47ffdc1ec3873233   Name:/dev/sdi            State:online    Size (GiB):99      Used (GiB):43      Free (GiB):56      
Id:5cf9e19d3644148deb37be38e1593321   Name:/dev/sde            State:online    Size (GiB):99      Used (GiB):42      Free (GiB):57      
Id:79849730260b3da1a84021138598e760   Name:/dev/sdj            State:online    Size (GiB):99      Used (GiB):44      Free (GiB):55      
Id:cd2fdfb6cfaf2c068dc4e9765914254d   Name:/dev/sdd            State:online    Size (GiB):99      Used (GiB):45      Free (GiB):54      
Id:e7bdd15d708892f31aa7952ed7e26d29   Name:/dev/sdg            State:online    Size (GiB):99      Used (GiB):97      Free (GiB):2       
Id:f4eb569019bcb482edaa3bd8e8a2ef75   Name:/dev/sdf            State:online    Size (GiB):99      Used (GiB):99      Free (GiB):0       
Id:fcb5ee83a2b6e94c4bcf208c3a356495   Name:/dev/sdh            State:online    Size (GiB):99      Used (GiB):99      Free (GiB):0       
[root@dhcp46-68 ~]# heketi-cli node info 8011c70e910300fdcc47371a20988289
Node Id: 8011c70e910300fdcc47371a20988289
State: online
Cluster Id: d86f9226ca8660768aac165bc153ca0f
Zone: 2
Management Hostname: dhcp46-214.lab.eng.blr.redhat.com
Storage Hostname: 10.70.46.214
Devices:
Id:0f3a2894165737740f6905c025591951   Name:/dev/sde            State:online    Size (GiB):99      Used (GiB):52      Free (GiB):47      
Id:b9f3b771b2af55a13809122d42fb1552   Name:/dev/sdi            State:online    Size (GiB):99      Used (GiB):29      Free (GiB):70      
Id:c5f31a5d2e8f1acd9c633856ea228b67   Name:/dev/sdh            State:online    Size (GiB):99      Used (GiB):44      Free (GiB):55      
Id:cbec32659243012771d34a328c96a04f   Name:/dev/sdg            State:online    Size (GiB):99      Used (GiB):47      Free (GiB):52      
Id:d051c6967ac778fcf123a6b6ff4b69e6   Name:/dev/sdj            State:online    Size (GiB):99      Used (GiB):99      Free (GiB):0       
Id:e5e3abba7ae56937010e55b603f61cd6   Name:/dev/sdd            State:online    Size (GiB):99      Used (GiB):99      Free (GiB):0       
Id:eb8d5f6181832f38f7237a19d0169ced   Name:/dev/sdf            State:online    Size (GiB):99      Used (GiB):99      Free (GiB):0       
[root@dhcp46-68 ~]# heketi-cli node info df597bba83cd6b9d0fa3f58693f3207c
Node Id: df597bba83cd6b9d0fa3f58693f3207c
State: online
Cluster Id: d86f9226ca8660768aac165bc153ca0f
Zone: 1
Management Hostname: dhcp47-39.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.39
Devices:
Id:135eccfad8fff5b8d22df0bf9fa62435   Name:/dev/sdi            State:online    Size (GiB):99      Used (GiB):16      Free (GiB):83      
Id:40add89337d6585c70149d8b5754e689   Name:/dev/sdd            State:online    Size (GiB):99      Used (GiB):26      Free (GiB):73      
Id:a13ceae201fb7d48719f5941a6e2e15c   Name:/dev/sdf            State:online    Size (GiB):99      Used (GiB):32      Free (GiB):67      
Id:ad555009def9baae0ba65b24a5afbdc8   Name:/dev/sde            State:online    Size (GiB):99      Used (GiB):98      Free (GiB):1       
Id:cc67894da6329bab6cd8c1962f6238ae   Name:/dev/sdg            State:online    Size (GiB):99      Used (GiB):99      Free (GiB):0       
Id:de5f8d3d74e708321200d30be876f799   Name:/dev/sdh            State:online    Size (GiB):99      Used (GiB):99      Free (GiB):0       
Id:f909be351a83e9fc7db36d15354f1d7f   Name:/dev/sdj            State:online    Size (GiB):99      Used (GiB):99      Free (GiB):0
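
To find the device behind the failing volume group in output like the above, the device lines can be parsed and matched on the ID. This is a rough sketch, not a heketi tool: it relies on the pattern visible in this report, where the volume group name is "vg_" followed by the heketi device ID, and `DEVICE_RE`, `parse_devices` and `device_for_vg` are hypothetical helper names.

```python
import re

# Matches one device line of `heketi-cli node info` as shown in this report.
DEVICE_RE = re.compile(
    r"Id:(?P<id>[0-9a-f]+)\s+Name:(?P<name>\S+)\s+State:(?P<state>\w+)\s+"
    r"Size \(GiB\):(?P<size>\d+)\s+Used \(GiB\):(?P<used>\d+)\s+"
    r"Free \(GiB\):(?P<free>\d+)"
)


def parse_devices(node_info_text):
    """Return {device_id: {...}} for every device line in the text."""
    devices = {}
    for match in DEVICE_RE.finditer(node_info_text):
        devices[match.group("id")] = {
            "name": match.group("name"),
            "state": match.group("state"),
            "size_gib": int(match.group("size")),
            "used_gib": int(match.group("used")),
            "free_gib": int(match.group("free")),
        }
    return devices


def device_for_vg(vg_name, devices):
    """Look up the device whose ID follows the 'vg_' prefix (observed pattern)."""
    prefix = "vg_"
    if not vg_name.startswith(prefix):
        return None
    return devices.get(vg_name[len(prefix):])


# Two device lines copied from the node info for df597bba83cd6b9d0fa3f58693f3207c.
sample = """
Id:a13ceae201fb7d48719f5941a6e2e15c   Name:/dev/sdf            State:online    Size (GiB):99      Used (GiB):32      Free (GiB):67
Id:ad555009def9baae0ba65b24a5afbdc8   Name:/dev/sde            State:online    Size (GiB):99      Used (GiB):98      Free (GiB):1
"""

devices = parse_devices(sample)
failing = device_for_vg("vg_ad555009def9baae0ba65b24a5afbdc8", devices)
print(failing)
```

The lookup resolves to /dev/sde, for which heketi reports 1 GiB free, while LVM reports only 23 free extents in the corresponding volume group.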

Comment 9 Humble Chirammal 2017-06-16 11:08:02 UTC
One thing to note here: the VG in question only has:

Id:ad555009def9baae0ba65b24a5afbdc8   Name:/dev/sde            State:online    Size (GiB):99      Used (GiB):98      Free (GiB):1

Comment 10 Humble Chirammal 2017-06-16 11:09:34 UTC
(In reply to Humble Chirammal from comment #9)
> One thing to note here: the VG in question only has:
> 
> Id:ad555009def9baae0ba65b24a5afbdc8   Name:/dev/sde            State:online 
> Size (GiB):99      Used (GiB):98      Free (GiB):1

How many volumes exist in this cluster? It would also be appreciated if you could attach the LVM outputs (pvs, vgs, lvs, etc.).

Comment 19 Humble Chirammal 2017-08-03 04:49:20 UTC
I agree to move this out of the CNS release, considering the complexity of the fix and the fact that this bug has been present since day 0.

Comment 27 Raghavendra Talur 2018-07-06 20:30:38 UTC
The cause of this bug is an inconsistency between heketi's db and the free space in LVM's VG. Heketi recorded more free space on a device than was actually available.

With the current builds, this inconsistency in the free space on a device should not happen. However, a test case is a little difficult to construct because the inconsistency was introduced only when:
1. gluster did not respond after volume creation
2. heketi could not perform kubeexec after volume creation

A possible way to hit this bug: while creating and deleting PVCs in a loop, reboot the OCP master node.


I will add a more deterministic test case if I find one. Please start with the one suggested above.
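
The kind of drift described in this comment can be illustrated with a toy model. This is only a sketch under assumptions and not heketi's actual code: `Device`, `allocate`, `rollback_unconditional` and `rollback_checked` are hypothetical names, and the model assumes the failure path returns space to the database even when the on-disk cleanup could not run.

```python
class Device:
    """Toy device: db_free is what the database believes is free,
    real_free is what the volume group can actually allocate (GiB)."""

    def __init__(self, size_gib):
        self.db_free = size_gib
        self.real_free = size_gib


def allocate(dev, size_gib):
    """Reserve space in the database and consume it on disk."""
    dev.db_free -= size_gib
    dev.real_free -= size_gib


def rollback_unconditional(dev, size_gib, cleanup_ok):
    """Hypothetical faulty error path: the database entry is always released."""
    if cleanup_ok:
        dev.real_free += size_gib
    dev.db_free += size_gib


def rollback_checked(dev, size_gib, cleanup_ok):
    """Hypothetical consistent error path: release only what was reclaimed."""
    if cleanup_ok:
        dev.real_free += size_gib
        dev.db_free += size_gib


def simulate(rollback, attempts=10):
    """Run failed 1 GiB creations where the on-disk cleanup cannot be executed."""
    dev = Device(99)
    for _ in range(attempts):
        allocate(dev, 1)
        rollback(dev, 1, cleanup_ok=False)
    return dev


drifted = simulate(rollback_unconditional)
consistent = simulate(rollback_checked)
print("unconditional:", drifted.db_free, drifted.real_free)
print("checked:", consistent.db_free, consistent.real_free)
```

In the unconditional variant the database ends up believing in more free space than the device has, which matches the symptom in this bug. The checked variant keeps both numbers equal at the cost of leaving space reserved until cleanup succeeds.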

Comment 34 Anjana KD 2018-09-07 18:37:18 UTC
Thank you, John. I have updated based on the feedback given.

Comment 35 Anjana KD 2018-09-07 18:40:08 UTC
Thank you, John. I have updated based on the feedback given.

Comment 36 John Mulligan 2018-09-07 19:04:41 UTC
Doc Text looks OK

Comment 38 errata-xmlrpc 2018-09-12 09:22:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2686

