Description of problem:

Booting master-controllers always panics and crashes with "panic: cannot find suitable quantum".

atomic-openshift-master-controllers[56139]: panic: cannot find suitable quantum for -0.01666666753590107
atomic-openshift-master-controllers[56139]: goroutine 200 [running]:
atomic-openshift-master-controllers[56139]: panic(0x36107a0, 0xc8296a9220)
atomic-openshift-master-controllers[56139]: /usr/lib/golang/src/runtime/panic.go:481 +0x3e6 fp=0xc829691a08 sp=0xc829691988
atomic-openshift-master-controllers[56139]: github.com/openshift/origin/vendor/github.com/juju/ratelimit.NewBucketWithRate(0xbf91111120000000, 0xfffffffffffffffe, 0x40)
atomic-openshift-master-controllers[56139]: /builddir/build/BUILD/atomic-openshift-git-0.d7bd9b6/_output/local/go/src/github.com/openshift/origin/vendor/github.com/juju/ratelimit/ratelimit.go:64 +0x150 fp=0xc829691a70 sp=0xc829691a08
atomic-openshift-master-controllers[56139]: github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/util/flowcontrol.NewTokenBucketRateLimiter(0xbc888889, 0xfffffffffffffffe, 0x0, 0x0)
atomic-openshift-master-controllers[56139]: /builddir/build/BUILD/atomic-openshift-git-0.d7bd9b6/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/util/flowcontrol/throttle.go:50 +0x3d fp=0xc829691aa8 sp=0xc829691a70
atomic-openshift-master-controllers[56139]: github.com/openshift/origin/pkg/cmd/server/origin.(*MasterConfig).RunImageImportController(0xc820be4000)
atomic-openshift-master-controllers[56139]: /builddir/build/BUILD/atomic-openshift-git-0.d7bd9b6/_output/local/go/src/github.com/openshift/origin/pkg/cmd/server/origin/run_components.go:401 +0xa2 fp=0xc829691b80 sp=0xc829691aa8
atomic-openshift-master-controllers[56139]: github.com/openshift/origin/pkg/cmd/server/start.startControllers(0xc820be4000, 0xc820a8b9e0, 0x0, 0x0)
atomic-openshift-master-controllers[56139]: /builddir/build/BUILD/atomic-openshift-git-0.d7bd9b6/_output/local/go/src/github.com/openshift/origin/pkg/cmd/server/start/start_master.go:697 +0x1763 fp=0xc829691ef0 sp=0xc829691b80
atomic-openshift-master-controllers[56139]: github.com/openshift/origin/pkg/cmd/server/start.(*Master).Start.func1(0xc820be4000, 0xc820a8b9e0)
atomic-openshift-master-controllers[56139]: /builddir/build/BUILD/atomic-openshift-git-0.d7bd9b6/_output/local/go/src/github.com/openshift/origin/pkg/cmd/server/start/start_master.go:415 +0x3d fp=0xc829691f70 sp=0xc829691ef0

Version-Release number of selected component (if applicable):

atomic-openshift-3.3.0.35-1.git.0.d7bd9b6.el7.x86_64

How reproducible:

Always in customer env

Steps to Reproduce:
1. Start atomic-openshift-master-controllers
2.
3.

Actual results:

master-controllers panics and crashes with "panic: cannot find suitable quantum"

Expected results:

Boot normally

Additional info:

Upstream code: https://github.com/juju/ratelimit/blob/master/ratelimit.go#L64
Per the discussion, this is an issue with ImportRateLimiter: flowcontrol.NewTokenBucketRateLimiter [1] and setting maxScheduledImageImportsPerMinute: -1

imagePolicyConfig:
  disableScheduledImport: false
  maxImagesBulkImportedPerRepository: 1
  maxScheduledImageImportsPerMinute: -1
  scheduledImageImportMinimumIntervalSeconds: 60

Based on this line from the panic:

atomic-openshift-master-controllers[56139]: /builddir/build/BUILD/atomic-openshift-git-0.d7bd9b6/_output/local/go/src/github.com/openshift/origin/pkg/cmd/server/origin/run_components.go:401 +0xa2 fp=0xc829691b80 sp=0xc829691aa8

[1] https://github.com/openshift/origin/blob/master/pkg/cmd/server/origin/run_components.go#L408
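To illustrate the suspected cause, here is a minimal Go sketch of a per-minute to per-second conversion. The helper name importsPerSecond is hypothetical and the division by 60 is an assumption, not taken from run_components.go; it is inferred from the panic value, which is approximately -1/60. With -1 the resulting rate is negative, which a token bucket cannot be built from.

```go
package main

import "fmt"

// importsPerSecond is a hypothetical helper: it converts a per-minute
// setting into a per-second rate. The division by 60 is an assumption
// inferred from the panic value (about -1/60), not copied from origin.
func importsPerSecond(perMinute int) float32 {
	return float32(perMinute) / 60.0
}

func main() {
	for _, perMinute := range []int{60, 5000, -1} {
		qps := importsPerSecond(perMinute)
		// A token bucket needs a strictly positive fill rate.
		fmt.Printf("perMinute=%d qps=%v usable=%t\n", perMinute, float64(qps), qps > 0)
	}
}
```

Under these assumptions, only the -1 setting yields a rate that is unusable for a rate limiter.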
It seems this has been broken for a long time. I tested v1.2.1 [1] and it also contains this error.

[1] https://github.com/openshift/origin/releases/tag/v1.2.1
Adding UpcomingRelease as this is not a blocker (it is not a regression). We still have to fix this, but I won't block the release on it.
Michal - there is a workaround for this IIRC - set the max to a very high value. If it won't be fixed (why? too risky?) we should at least get a known issues doc or kb for this.
(In reply to Paul Weil from comment #4)
> Michal - there is a workaround for this IIRC - set the max to a very high
> value. If it won't be fixed (why? too risky?) we should at least get a
> known issues doc or kb for this.

I think you can set disableScheduledImport: true in case you want to disable the scheduled import? I think we can validate this and only allow positive values, but we will have to backport this.
-1 is supposed to mean unlimited (https://docs.openshift.org/latest/install_config/master_node_configuration.html). Disabling isn't going to help in this situation. But setting it to something like 5000 would effectively give unlimited and avoid the -1 bug.
I think the solution Paul proposed (use a very high number to get "unlimited" behavior) seems correct. Would that be sufficient to close this bug?
No. According to the current doc, the -1 value should work, so at least we need to fix the doc if we won't fix the code.
(In reply to Takayoshi Kimura from comment #8)
> No according to the current doc, the -1 value should work, so at least we
> need to fix the doc if we won't fix the code.

I would rather update the documentation. Thanks!
Docs PR: https://github.com/openshift/openshift-docs/pull/3638
(We should also check for -1 in validation and refuse it; only positive values should be allowed.)
PR: https://github.com/openshift/origin/pull/13315
The doc change is OK. To check the validation of the value -1, I used the devenv-rhel7_6073 image. With 'maxScheduledImageImportsPerMinute: -1' set, openshift starts successfully and there are no more panics. Is this the correct result?
Yes, that value should disable rate limiting.
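One way to get the behavior described here (-1 disables rate limiting instead of producing a negative rate) is sketched below. All names (limiter, alwaysAccept, budget, newImportLimiter) are hypothetical and the budget type is a deliberately simplified stand-in for a real token bucket; this is an illustration under those assumptions, not the merged fix.

```go
package main

import "fmt"

// limiter is a hypothetical, minimal stand-in for a rate limiter interface.
type limiter interface {
	TryAccept() bool
}

// alwaysAccept never throttles; it models the "unlimited" (-1) setting.
type alwaysAccept struct{}

func (alwaysAccept) TryAccept() bool { return true }

// budget allows a fixed number of accepts. A real limiter would refill
// tokens over time; that is omitted to keep the sketch short.
type budget struct{ remaining int }

func (b *budget) TryAccept() bool {
	if b.remaining <= 0 {
		return false
	}
	b.remaining--
	return true
}

// newImportLimiter maps the configured value to a limiter without ever
// computing a negative rate.
func newImportLimiter(perMinute int) (limiter, error) {
	switch {
	case perMinute == -1:
		return alwaysAccept{}, nil
	case perMinute <= 0:
		return nil, fmt.Errorf("invalid maxScheduledImageImportsPerMinute: %d", perMinute)
	default:
		return &budget{remaining: perMinute}, nil
	}
}

func main() {
	l, err := newImportLimiter(-1)
	if err != nil {
		panic(err)
	}
	fmt.Println("unlimited accepts:", l.TryAccept())
}
```

The design choice is to special-case -1 before any rate arithmetic, so the sentinel never reaches the bucket constructor.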
Will this bug be fixed in OCP 3.3? If so, I will verify it with the 3.3 puddle.
This is only for 3.6 and will not be backported. Current workaround is still to set a high number on previous releases.
Waiting for a new 3.6 build to verify; changing the status to MODIFIED.
This has been merged into ocp and is in OCP v3.6.27 or newer.
Verified in OCP 3.6.27.

# openshift version
openshift v3.6.27
kubernetes v1.5.2+43a9be4
etcd 3.1.0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716