1559834 – Created 219 volumes of 1gb each but space utilized is greater than 1Tb

Summary: Created 219 volumes of 1gb each but space utilized is greater than 1Tb

Keywords:
Status:	CLOSED DUPLICATE of bug 1554467
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	heketi
Sub Component:
Version:	cns-3.9
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Michael Adam
QA Contact:	Rachael
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	OCS-3.11.1-devel-triage-done

Reported:	2018-03-23 11:41 UTC by Rachael
Modified:	2019-01-23 20:28 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-01-23 20:28:50 UTC
Embargoed:
Dependent Products:


Description Rachael 2018-03-23 11:41:10 UTC

Description of problem:
A script was run to create 300 PVCs of 1 GB each. Each node has two devices, one of 1 TB and the other of 50 GB. After 219 volumes had been created, the heketi logs showed a "no space" error. The heketi topology info also shows 0 free space on all devices across all three nodes.
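
As a rough sanity check on the figures above, the following sketch compares the raw capacity of the cluster with the space that 219 volumes of 1 GB would be expected to consume. It assumes replica-3 volumes with one brick per node (the replica count is not stated in this report), takes 1 TB as 1000 GB, and ignores thin-pool metadata and filesystem overhead, so the result is only an approximation.

```python
# Figures taken from the problem description.
NODES = 3
DEVICES_GB_PER_NODE = [1000, 50]  # 1 TB + 50 GB; 1 TB = 1000 GB is an approximation
VOLUMES_CREATED = 219
VOLUME_SIZE_GB = 1

# Assumption, not stated in the report: each volume is replica 3.
REPLICA_COUNT = 3


def raw_capacity_gb(nodes, devices_gb_per_node):
    """Total raw device capacity across all nodes, in GB."""
    return nodes * sum(devices_gb_per_node)


def expected_brick_usage_gb(volumes, size_gb, replica_count):
    """Space the bricks alone would consume, ignoring metadata overhead."""
    return volumes * size_gb * replica_count


capacity = raw_capacity_gb(NODES, DEVICES_GB_PER_NODE)
expected = expected_brick_usage_gb(VOLUMES_CREATED, VOLUME_SIZE_GB, REPLICA_COUNT)

print(f"raw capacity:         {capacity} GB")
print(f"expected brick usage: {expected} GB")
print(f"expected free space:  {capacity - expected} GB")
```

Under these assumptions the 219 volumes account for only a fraction of the raw capacity, which is why a report of 0 free space is unexpected.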


Version-Release number of selected component (if applicable):
rhgs-volmanager-rhel7   v3.9.0

How reproducible:

Actual results:
PVC creation fails due to no space

Expected results:
PVC creation should be successful

Comment 2 Rachael 2018-03-23 11:51:34 UTC

Logs are available here: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1559834/log/

Comment 3 Rachael 2018-03-26 13:58:40 UTC

Added heketi.db and the script used for PVC creation to the logs: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1559834/log/

Comment 4 Raghavendra Talur 2018-03-27 08:52:39 UTC

Looking through the heketi logs, I see:

$ grep "Started async operation: Create Volume" heketi.log  | wc -l
1062
$ grep "Started POST /volumes" heketi.log   | wc -l
21624

I think this means that heketi accepted 1062 requests for volume creation, while 21624 requests for volume create OR volume expand reached negroni. Either that many PVC requests were actually made, or the OpenShift storage provisioner issued them as part of its retry mechanism.

I need logs from the provisioner to debug further.
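
The two grep counts above can also be reproduced in a script, which makes it easier to compare them. This is a minimal sketch: `count_matching_lines` is a hypothetical helper, and the sample lines are made up for illustration and are not taken from the real heketi.log. Only the two search strings and the reported totals come from this comment.

```python
def count_matching_lines(lines, needle):
    """Count lines containing needle, like `grep needle | wc -l`."""
    return sum(1 for line in lines if needle in line)


# Made-up sample lines for illustration only (not from the real heketi.log).
sample_log = [
    "[negroni] Started POST /volumes",
    "[negroni] Started POST /volumes",
    "[heketi] INFO Started async operation: Create Volume",
    "[negroni] Started POST /volumes/abc123/expand",
    "[negroni] Started GET /queue/xyz",
]

accepted = count_matching_lines(sample_log, "Started async operation: Create Volume")
posted = count_matching_lines(sample_log, "Started POST /volumes")
print(accepted, posted)

# Totals reported in this comment for the real log.
REPORTED_ACCEPTED = 1062
REPORTED_POSTED = 21624

posts_per_accepted = REPORTED_POSTED / REPORTED_ACCEPTED
print(f"about {posts_per_accepted:.1f} POST requests per accepted create")
```

To run this against the real log, pass an open file object instead of the sample list, for example `count_matching_lines(open("heketi.log"), "Started POST /volumes")`. Because the match is a plain substring test, the second pattern also counts volume expand requests, just as the grep does.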

Comment 5 Raghavendra Talur 2018-03-27 13:26:53 UTC

oc describe pod heketi.....


  Type     Reason         Age                  From                                        Message                     
  ----     ------         ----                 ----                                        -------                     
  Warning  InspectFailed  1h (x19 over 5h)     kubelet, dhcp47-160.lab.eng.blr.redhat.com  Failed to inspect image "rhgs3/rhgs-volmanager-rhel7:v3.9.0": rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Warning  Unhealthy      59m (x380 over 23h)  kubelet, dhcp47-160.lab.eng.blr.redhat.com  Liveness probe failed: Get http://10.129.0.8:8080/hello: dial tcp 10.129.0.8:8080: getsockopt: connection refused
  Warning  Failed         38m (x55 over 6h)    kubelet, dhcp47-160.lab.eng.blr.redhat.com  Error: context deadline exceeded
  Normal   Pulled         34m (x257 over 23h)  kubelet, dhcp47-160.lab.eng.blr.redhat.com  Container image "rhgs3/rhgs-volmanager-rhel7:v3.9.0" already present on machine
  Normal   Killing        13m (x353 over 23h)  kubelet, dhcp47-160.lab.eng.blr.redhat.com  Killing container with id docker://heketi:Container failed liveness probe.. Container will be killed and recreated.
  Warning  Unhealthy      9m (x2760 over 23h)  kubelet, dhcp47-160.lab.eng.blr.redhat.com  Readiness probe failed: Get http://10.129.0.8:8080/hello: dial tcp 10.129.0.8:8080: getsockopt: connection refused
  Normal   Created        3m (x194 over 23h)   kubelet, dhcp47-160.lab.eng.blr.redhat.com  Created container

Comment 6 Rachael 2018-03-27 13:36:47 UTC

Logs from the provisioner have been added to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1559834/log/

Comment 19 Raghavendra Talur 2019-01-23 20:28:50 UTC


*** This bug has been marked as a duplicate of bug 1554467 ***
