Bug 1609360 - [Tracker-OCP-BZ#1613781] scalability issue at external dynamic prov
Summary: [Tracker-OCP-BZ#1613781] scalability issue at external dynamic prov
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhgs-gluster-block-prov-container
Version: cns-3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: CNS 3.10
Assignee: Humble Chirammal
QA Contact: Prasanth
URL:
Whiteboard: aos-scalability-311
Depends On: 1613781
Blocks:
 
Reported: 2018-07-27 17:58 UTC by Hongkai Liu
Modified: 2023-09-14 04:32 UTC
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-27 18:50:18 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1600160 0 unspecified CLOSED scalability issue with gluster file on OCP 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1613781 0 high CLOSED scalability issue with external provisioner controller on leader election and api throttling. 2023-09-14 04:32:53 UTC

Internal Links: 1600160 1613781

Description Hongkai Liu 2018-07-27 17:58:36 UTC
Description of problem:
PVC provisioning is slow, and the time spent on each PVC is unpredictable.

Version-Release number of selected component (if applicable):
# oc get pod -n glusterfs -o yaml | grep "image:" | sort -u
      image: registry.reg-aws.openshift.com:443/rhgs3/rhgs-gluster-block-prov-rhel7:3.3.1-20
      image: registry.reg-aws.openshift.com:443/rhgs3/rhgs-server-rhel7:3.3.1-27
      image: registry.reg-aws.openshift.com:443/rhgs3/rhgs-volmanager-rhel7:3.3.1-21

# yum list installed | grep openshift
atomic-openshift.x86_64       3.10.18-1.git.0.13dc4a0.el7

How reproducible: Always


Steps to Reproduce:
1. Create 250 PVCs (either gluster.file or gluster.block); see the sketch after this list.
2. Measure the number of bound PVs each minute.
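
For reference, a minimal shell sketch of this reproduction (the storage class name and the "cns" namespace are assumptions; the actual runs used a Python script):

for i in $(seq 1 250); do
  oc create -n cns -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-$i
spec:
  storageClassName: glusterfs-storage-block
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi
EOF
done

# Record the number of bound PVs once per minute
while true; do date; oc get pv --no-headers | grep -c Bound; sleep 60; done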

Actual results:
Both cases have been demonstrated to John Mulligan in a demo.

Case 1: If the block hosting volume does not exist, the bound-PV count stays stuck at 0.

Case 2: If the block hosting volume exists, the count usually stays stuck at 3 for a long time and then finally stops at 199.
http://file.rdu.redhat.com/~hongkliu/test_result/bz1600160/20180726.longer.2/

Expected results:
1000 pods with gluster-block PVCs are running.

Additional info:

Comment 2 John Mulligan 2018-07-28 15:22:58 UTC
I was also able to reproduce a similar lack of timely block PVC provisioning on a cluster I set up independently.

I requested >100 PVCs via:
for i in $(seq 0 100); do sed s/PVCNAME/block-1-$i/ bc.yaml | oc apply -f- ; done

There's an initial burst of activity in the block provisioner logs, then activity decreases and does not appear to resume. Heketi logs indicate that it is not receiving any requests for new block volumes.

The block provisioner log mainly consists of lines like:
I0728 14:27:56.953340       1 controller.go:1167] scheduleOperation[lock-provision-cns/block-1-20[c06a4991-926f-11e8-b008-5254005f433e]]
I0728 14:27:56.953352       1 controller.go:1167] scheduleOperation[lock-provision-cns/block-1-33[c2759f71-926f-11e8-b008-5254005f433e]]
I0728 14:27:56.953365       1 controller.go:1167] scheduleOperation[lock-provision-cns/block-1-26[c1525238-926f-11e8-b008-5254005f433e]]
I0728 14:27:56.953377       1 controller.go:1167] scheduleOperation[lock-provision-cns/block-1-43[c4598d59-926f-11e8-b008-5254005f433e]]
I0728 14:27:56.953387       1 controller.go:1167] scheduleOperation[lock-provision-cns/block-1-78[ca3d61dc-926f-11e8-b008-5254005f433e]]
I0728 14:27:56.953400       1 controller.go:1167] scheduleOperation[lock-provision-cns/block-1-64[c8048cb3-926f-11e8-b008-5254005f433e]]
I0728 14:27:56.953411       1 controller.go:1167] scheduleOperation[lock-provision-cns/block-1-84[cb4a2c2a-926f-11e8-b008-5254005f433e]]
I0728 14:27:56.953422       1 controller.go:1167] scheduleOperation[lock-provision-cns/block-1-90[cc45933f-926f-11e8-b008-5254005f433e]]


I did some searching around and found the following links:
  https://github.com/kubernetes-csi/external-provisioner/issues/68

  https://github.com/kubernetes-incubator/external-storage/pull/837

  https://github.com/kubernetes-csi/external-provisioner/pull/104

  https://github.com/kubernetes-incubator/external-storage/pull/825

  https://github.com/kubernetes-incubator/external-storage/commit/126c9ffc8ef125460f6bd75d903ea1d26051b730#diff-634ce7794d379f3ba2119e2413074537


They indicate that there are potential scalability issues with how the external provisioners communicate with the Kubernetes API server. Suspiciously, one of the above links mentions "100 PVCs created at same time", which is quite similar to how our PVCs were created.
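
If the provisioner pod is run with a higher log verbosity (e.g. -v=4), one quick way to check whether client-side API throttling is in play is to grep its logs for client-go's throttling message (the pod name below is a placeholder):

oc logs -n glusterfs <glusterblock-provisioner-pod> | grep -i "throttling request"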


@Hongkai, I wonder if you could start with creating a smaller number of PVCs (say 30) and then increase the batch size incrementally to find out when the issue kicks in (see the sketch below).
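
A rough sketch of that incremental approach (the template name, namespace, and sleep interval are placeholders):

for batch in 30 40 50 60 70 80 90 100 110; do
  for i in $(seq 1 $batch); do sed "s/PVCNAME/block-$batch-$i/" bc.yaml | oc apply -f- ; done
  sleep 600   # give the provisioner time, then note how many claims bound
  oc get pvc -n cns --no-headers | grep -c Bound
done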


I also don't think that gluster-block is the right component for this. I don't think it's heketi, and I couldn't find a specific component for the gluster block provisioner. I am pinging Humble so that he can suggest a better component (and see this whole message in general).

Comment 3 Prasanna Kumar Kalever 2018-07-30 06:13:55 UTC
Thanks John for the analysis.



Hey Hongkai,
Would it be possible for you to attach the gluster-block logs?

Comment 4 Humble Chirammal 2018-07-30 09:41:49 UTC
I have changed the component to rhgs-block-provisioner from gluster-block. 

I would like to share my thoughts on the other technical details mentioned in comment#2:

Yes, there are known limitations and issues with respect to leader election and API server throttling in the external provisioner controller. In that sense, gluster-block-provisioner is not the correct component for this bugzilla, or rather the fix has to come in the controller. I can track the bug accordingly.

Back to the issues mentioned here:

There are 2 issues mentioned in this setup

------snip--------------------

Case 1: If the block hosting volume does not exist, then it stuck with 0.

Case 2: If the block hosting volume exists, then it stuck with 3 for a long time (usually) and then stuck at 199 finally.
http://file.rdu.redhat.com/~hongkliu/test_result/bz1600160/20180726.longer.2/

-----/snip--


Case 1: This is tracked under another bugzilla, so I am setting it aside for now.


Case 2: I still don't know what's going wrong with the mentioned PVCs.

Hongkai, can you please mention the names of the PVCs behind the counts of 3 and 199?

Comment 7 Hongkai Liu 2018-07-30 12:45:35 UTC
@Prasanna,

Our test cluster is terminated on a daily basis, but I can redo this today.
Are the logs in the /var/log/ folder on the nodes where the glusterfs pods are running?

@Humble

The pods and the PVCs are created by a Python script. After that, I check the number of bound PVCs; the rest are Pending. I do not know exactly which one it is stuck at.
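
Next run I can list which claims stay Pending with something like this (the namespace here is an assumption based on the provisioner log lines above):

oc get pvc -n cns --no-headers | grep -c Bound
oc get pvc -n cns --no-headers | grep Pending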

@John,

I will do batches of 30, 40, 50, ..., 100, 110 today and update the results here.

Comment 8 Humble Chirammal 2018-07-30 13:07:57 UTC
(In reply to Hongkai Liu from comment #7)

> @Humble
> 
> the pods and the PVCs are created by a python script. After that, I check
> the number of bound PVCs, the rest is at Pending. I do not know which one
> exactly it stuck at.

Hmm, I had gone through the provisioner logs and was unsure which claim to chase; that's why I was looking for the PVCs that are in Pending.

If you are reproducing it again, can you please use the below image?

brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-gluster-block-prov-rhel7:3.3.1-22

Comment 10 John Mulligan 2018-07-30 18:10:23 UTC
(In reply to Humble Chirammal from comment #4)
> I have changed the component to rhgs-block-provisioner from gluster-block. 
> 
> I would like to share my thoughts on other technical details mentioned in
> comment#2:
> 
> Yes, there are known limitations or issues wrt leader election and api
> server throttling in external controller, in that sense
> gluster-block-provisioner is not the correct component of the bugzilla or
> the fix has to come in controller. I can track the bug accordingly.  
> 
> Back to the issues mentioned here:
> 
> There are 2 issues mentioned in this setup
> 
> ------snip--------------------
> 
> Case 1: If the block hosting volume does not exist, then it stuck with 0.
> 
> Case 2: If the block hosting volume exists, then it stuck with 3 for a long
> time (usually) and then stuck at 199 finally.
> http://file.rdu.redhat.com/~hongkliu/test_result/bz1600160/20180726.longer.2/
> 
> -----/snip--
> 
> 
> Case 1 : This is tracked under another bugzilla, so discarding it for now.
> 

Humble, could you please include the bug #? I was unaware of an existing bug for this.

> 
> Case 2: I still dont know whats going wrong with the mentioned PVCs. 
> 
> Hongkai, can you please mention the pvc name of 3 and 199 ?


I want to be very careful with this last "item", as when I looked it over live it appeared that these were in fact two separate issues.
* The "slow start", where the provisioner seems to sit there doing nothing for a long while before provisioning volumes
* Failures to provision any volumes after the 199th.

The first case is what I wrote about in my comment #2.

The 2nd case _is_ a Heketi bug, bz#1596018, which has a fix implemented.
I want to make sure we avoid mixing the two problems. I am certain about this latter part because the Heketi logs match the behavior seen for that bug.

For me the main mystery was the "slow start" and that's what I looked into after my meeting with Hongkai.

Comment 11 Humble Chirammal 2018-07-31 05:52:05 UTC
(In reply to John Mulligan from comment #10)

> > 
> > 
> > Case 1 : This is tracked under another bugzilla, so discarding it for now.
> > 
> 
> Humble, could you please include the bug #. I was unaware of an existing bug
> for this.

I was confusing it with one of the existing bugs and just realized it's different. Can we open a new bug against heketi for this issue?

> 
> > 
> > Case 2: I still dont know whats going wrong with the mentioned PVCs. 
> > 
> > Hongkai, can you please mention the pvc name of 3 and 199 ?
> 
> 
> I want to be very careful with this last "item" as when I looked over it
> live it appeared that these were in fact two separate issues. 

> * The "slow start" where the provisioner seems to sit there doing nothing a
> long while before provisioning volumes
> * Failures to provision any volumes after the 199th.


> 
> The first case I what I wrote about in my comment #2. 

Yes, it's what I am trying to mitigate, and I am waiting for the result of testing the latest block provisioner image.

> 
> The 2nd case _is_ a Heketi bug, bz#1596018, which has a fix implemented.
>  I want to make sure we avoid mixing the two problems. I am certain about
> the this latter part because the Heketi logs match the behavior seen for
> that bug.
> 
> For me the main mystery was the "slow start" and that's what I looked into
> after my meeting with Hongkai.

Sure, thanks for clarifying the details based on your experience looking at Hongkai's setup.

Comment 19 Humble Chirammal 2018-08-01 11:52:31 UTC
Fixed in version: rhgs-gluster-block-prov-container-3.3.1-23

Comment 32 Humble Chirammal 2018-08-24 03:34:42 UTC
Hongkai, I have built the gluster block provisioner with the external controller library mentioned at https://bugzilla.redhat.com/show_bug.cgi?id=1613781#c5. This is supposed to improve the scalability of gluster block provisioning. The resulting image is brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-gluster-block-prov-rhel7:3.4.0-2

Do you have a setup to try this image and share the result?

Even though we agreed to defer this bug from the OCS 3.10 release, I would like to see the result with the above-mentioned image before doing so. :)

I am moving this bug to ON_QA.
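
For reference, one way to swap the test image into the existing provisioner deployment (the DeploymentConfig and container names below are placeholders, not the exact objects on your cluster):

oc -n glusterfs set image dc/<glusterblock-provisioner-dc> <container-name>=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-gluster-block-prov-rhel7:3.4.0-2
oc -n glusterfs rollout status dc/<glusterblock-provisioner-dc>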

Comment 33 Hongkai Liu 2018-08-24 11:34:46 UTC
Hi Humble,
thanks for the update.
Our OCP cluster is created/terminated on a daily basis.
I will do the test today.

Comment 44 Red Hat Bugzilla 2023-09-14 04:32:16 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

