Red Hat Bugzilla – Bug 1474099
30+ minutes for metrics to re-stabilize after heapster restart @ 15K pods
Last modified: 2017-08-22 10:06:37 EDT
Description of problem:
- metrics are stable at 15K pods - no push failures and no missing metrics.
Retrieving metrics via the Hawkular REST API and there are no empty buckets
- stop heapster, wait a few minutes and verify the most recent buckets are empty
- start heapster. Watch heapster logs for push failures and check metrics for missing data
It takes 23 minutes for any data to be pushed to Hawkular->Cassandra again, and 30 minutes before pushes completely stabilize and no holes appear in the metrics data
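For reference, spotting the gap via the Hawkular REST API comes down to checking the `empty` flag on bucketed data points. A minimal sketch (the sample payload below is fabricated, but the `empty` field is part of Hawkular's bucketed-response shape):

```python
# Detect gaps in a Hawkular Metrics bucketed-data response.
# Bucket points carry an "empty" flag when no samples fell inside them.

def find_empty_buckets(buckets):
    """Return the start timestamps of buckets that contain no data."""
    return [b["start"] for b in buckets if b.get("empty", False)]

# Fabricated sample: three 2-minute buckets, the middle one empty.
sample = [
    {"start": 0,      "end": 120000, "avg": 12.5, "empty": False},
    {"start": 120000, "end": 240000, "empty": True},
    {"start": 240000, "end": 360000, "avg": 13.1, "empty": False},
]

print(find_empty_buckets(sample))  # -> [120000]
```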
Version-Release number of selected component (if applicable): Metrics container version 3.6.152
How reproducible: Always, when restarting heapster with a large number of pods. I have not determined the threshold.
Steps to Reproduce:
1. Deploy metrics
2. Start 15K pods and verify metrics are being collected for all pods - stable metrics with no errors in the heapster logs
3. Stop heapster, verify metrics are no longer collected
4. Start heapster, watch the heapster error logs for " Failed to push data to sink: Hawkular-Metrics Sink" to stop occurring
5. Watch the OpenShift UI or Hawkular REST API for metrics collection to successfully restart
Actual results: 30+ minutes for metrics to be collected in a stable fashion
Expected results: Heapster collects metrics and pushes them to Hawkular + Cassandra within 1 or 2 intervals of restarting.
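Step 4 above amounts to scanning the heapster log until the quoted sink error stops appearing. A throwaway scan like this can locate the last failure (the surrounding log lines here are invented; only the error string is taken from the report):

```python
FAILURE = "Failed to push data to sink: Hawkular-Metrics Sink"

def last_failure_line(log_lines):
    """Return the index of the last push-failure line, or None if clean."""
    last = None
    for i, line in enumerate(log_lines):
        if FAILURE in line:
            last = i
    return last

# Invented sample log around a restart.
log = [
    "I0719 heapster started",
    "E0719 Failed to push data to sink: Hawkular-Metrics Sink",
    "E0719 Failed to push data to sink: Hawkular-Metrics Sink",
    "I0719 push succeeded",
]
print(last_failure_line(log))  # -> 2
```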
Additional info: Heapster logs attached. Let me know what further info is required. The configuration is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1465532
Created attachment 1303352 [details]
heapster log after restart with 15K pods running on 100 nodes
Heapster will try to update the metric definitions when the server starts; this means reading metric definitions from Hawkular Metrics and potentially writing back updates.
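That read-then-write-back cycle can be sketched roughly as follows (the function name and tag layout are assumptions for illustration, not Heapster's actual code):

```python
def refresh_definitions(stored, desired):
    """Compare stored metric-definition tags against the tags Heapster
    wants, and return only the definitions that need a write-back."""
    updates = {}
    for metric_id, tags in desired.items():
        if stored.get(metric_id) != tags:
            updates[metric_id] = tags
    return updates

# Invented example: one definition unchanged, one with a new tag,
# one entirely new. Only the last two require writes.
stored = {"pod-1/cpu": {"pod": "pod-1"},
          "pod-2/cpu": {"pod": "pod-2"}}
desired = {"pod-1/cpu": {"pod": "pod-1"},
           "pod-2/cpu": {"pod": "pod-2", "node": "n1"},
           "pod-3/cpu": {"pod": "pod-3"}}
print(refresh_definitions(stored, desired))
```

At 15K pods even a small per-definition cost multiplies quickly, which is why the restart is so much more expensive than steady-state operation.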
Is this something new in 3.6 or something we also experienced in 3.5?
Were any other updates made to the default configurations for Heapster or Hawkular Metrics? Could you please attach the output of 'oc get pods -n openshift-infra -o yaml'
@miburman: any thoughts here?
Not sure about 3.5 - we'd have to set up an environment to test that.
The only Heapster configuration modification is to remove the memory limit.
I hit an instance today where heapster seems never to reconnect. Re-testing that now. Let me know if there is any extra logging or other info you want to see.
Going to try deploying 3.5 in this cluster and attempt to reproduce.
Also, restoring the needinfo for miburman that I inadvertently cleared (@miburman see comment 2)
There's probably too big a backlog of updates to the definitions that can't be handled (or, in other words, takes too much time). This is the same effect as when there are too many new pods. We haven't made changes to this between 3.5 and 3.6 on the Heapster side, so the behavior should be the same in 3.5. That is, if we would see the same behavior in normal operations.
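As a rough back-of-envelope illustration of that backlog (all numbers below are assumptions, not measurements from this cluster), a full definition refresh at 15K pods is on the order of hundreds of thousands of read-then-write operations:

```python
# Assumed figures, purely to illustrate the scale of the backlog.
pods = 15_000
metrics_per_pod = 10      # assumed: cpu, memory, network, filesystem, ...
updates_per_sec = 200     # assumed sustained definition-update throughput

definitions = pods * metrics_per_pod
minutes = definitions / updates_per_sec / 60
print(f"{definitions} definitions -> ~{minutes} min to refresh")
```

With these made-up rates the refresh alone lands in the tens of minutes, which is consistent with the delays reported above.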
Correction to comment 3, where I said it never reconnects. That is false: it did eventually pick up and started pushing metrics to Hawkular->Cassandra.
I deployed metrics 3.5 in the same cluster in the exact same state (15K running pods) and metrics started showing up within 2 minutes and by 12 minutes after heapster start all 5 of the 2-minute buckets I was requesting had good data.
I re-deployed 3.6 and saw the same behavior: no metrics data for an extended period of time after restarting heapster. I understand the frustration with nothing changing in those areas between releases; I'm just reporting what's happening.
We have reproducers of both the good/3.5 and reported behaviors on this 100 node AWS cluster. Let us know if we can get any additional data.
Can we run a mixture of 3.5 & 3.6 images? If that's possible, a combination of 3.5 Heapster + 3.6 Hawkular-Metrics and 3.6 Heapster + 3.5 Hawkular-Metrics would be very interesting.
Maybe the problem is somewhere else in Heapster and not anywhere we're currently looking (we've been concentrating on the sink<->Hawkular-Metrics integration, but perhaps that's not the biggest pain).
Would the compression job startup right after the pods are restarted as well?
(In reply to Michael Burman from comment #8)
> Can we run a mixture of 3.5 & 3.6 images? If that's possible, a combination
> of 3.5 Heapster + 3.6 Hawkular-Metrics and 3.6 Heapster + 3.5
> Hawkular-Metrics would be very interesting.
> Maybe the problem is somewhere else in the Heapster and not in anywhere
> we're currently looking at (we've been concentrating on the
> sink<->Hawkular-Metrics integration, but perhaps that's not the biggest
> pain).
I am going to try this, will update the bug.
(In reply to Matt Wringe from comment #9)
> Would the compression job startup right after the pods are restarted as well?
Compression is disabled on this cluster.
(In reply to Michael Burman from comment #8)
Just tried with the 3.5 version of heapster; the result is the same. It took around 32 mins for metrics to get stable.
(In reply to Vikas Laad from comment #12)
Just tried with the 3.5 version of heapster; the result is the same. It took around 32 mins for metrics to get stable. All the other components installed are 3.6; only the heapster rc was updated to use the 3.5 image and scaled.
The 3.5 version of Cassandra was also tried, and it still had the same effect.
At this point it is looking like a change in Hawkular Metrics has caused this regression.
We did make some changes which could cause performance to degrade, but this was a required change to prevent metric tags data from getting lost. We are investigating whether this change is the cause of this issue.
This is not a regression; I deployed metrics a few times on 3.5 and compared with 3.6 performance. I do not see any difference: it's almost the same time, 24 secs to start loading metrics. This is still a performance issue and needs to be fixed.
(In reply to Vikas Laad from comment #15)
correction: 24 mins to start loading metrics.
There's an improvement in the Heapster's master version for this. I'll make a backport for our current Heapster version also and submit to origin-metrics for testing.
(In reply to Vikas Laad from comment #16)
> (In reply to Vikas Laad from comment #15)
> correction: 24 mins to start loading metrics.
You got my hopes up :)
This is expected behavior though, as we have to refresh all the tags in Hawkular-Metrics when the restart happens (the initial caching only takes care of the _system tenant).
The only solutions are a) Heapster requests metric definitions for all active projects, or b) speed up tag updates on the Hawkular-Metrics side. Option a) requires that we somehow know all the active projects in Heapster, or that we maintain yet another cache of projects seen and request caches from Hawkular-Metrics when a project is seen for the first time. I'm not sure the latter would speed up the loading that much, though.
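Option a)'s "cache of projects seen" could look roughly like this (all names are hypothetical; the prefetch callback stands in for a definitions request to Hawkular-Metrics):

```python
class ProjectCache:
    """Track projects (tenants) already seen; trigger a one-time
    definition prefetch the first time each project appears."""

    def __init__(self, prefetch):
        self.seen = set()
        self.prefetch = prefetch  # e.g. fetch defs from Hawkular-Metrics

    def observe(self, project):
        if project not in self.seen:
            self.seen.add(project)
            self.prefetch(project)

# Record which projects get prefetched as metrics stream in.
fetched = []
cache = ProjectCache(fetched.append)
for p in ["proj-a", "proj-b", "proj-a"]:
    cache.observe(p)
print(fetched)  # each project prefetched exactly once
```

This spreads the definition loading across the first scrape of each project instead of doing it all up front, at the cost of one more cache to keep consistent.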
With the latest image provided in bz https://bugzilla.redhat.com/show_bug.cgi?id=1465532, this time has been reduced to 16 mins for 15K pods.