Bug 1391996
| Summary: | Openshift Metrics Heapster pod restarting when Openshift metrics configured to monitor many pods ( in this specific case 15k ) | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Elvir Kuric <ekuric> |
| Component: | Hawkular | Assignee: | Matt Wringe <mwringe> |
| Status: | CLOSED DUPLICATE | QA Contact: | Peng Li <penli> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.3.1 | CC: | aos-bugs, jeder, pweil, snegrea, tstclair |
| Target Milestone: | --- | | |
| Target Release: | 3.7.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | aos-scalability-34 | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-04 15:47:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
*** This bug has been marked as a duplicate of bug 1465532 ***
Created attachment 1217416
metrics pods logs

Description of problem:

Running "oc get pods" in the openshift-infra project gives the following output:

    # oc get pods
    NAME                         READY     STATUS      RESTARTS   AGE
    hawkular-cassandra-1-mp5gn   1/1       Running     0          2d
    hawkular-cassandra-2-z8rl0   1/1       Running     2          2d
    hawkular-metrics-2z5so       1/1       Running     0          2d
    hawkular-metrics-5srpo       1/1       Running     0          2d
    heapster-0npf8               1/1       Running     18         2d
    metrics-deployer-op83i       0/1       Completed   0          3d

The output shows that the heapster pod was restarted many times (18 restarts in 2 days) for an unknown reason.

Version-Release number of selected component (if applicable):

OpenShift:

    rpm -qa | grep atomic
    atomic-openshift-dockerregistry-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    atomic-openshift-pod-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    atomic-openshift-clients-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    atomic-openshift-node-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    atomic-openshift-tests-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    atomic-openshift-clients-redistributable-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    tuned-profiles-atomic-openshift-node-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    atomic-openshift-master-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    tuned-profiles-atomic-2.7.1-3.el7.noarch
    atomic-openshift-3.3.1.1-1.git.0.629a1d8.el7.x86_64
    atomic-openshift-sdn-ovs-3.3.1.1-1.git.0.629a1d8.el7.x86_64

and metrics images v.3.3

How reproducible:

I have seen this issue when OpenShift metrics was supposed to monitor 15k pods across 220 nodes.

Actual results:

The heapster pod fails and is restarted repeatedly.

Expected results:

Additional info:

Log files for heapster / hawkular / cassandra are attached to this BZ.
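For anyone triaging a similar cluster, the restart counts in the "oc get pods" listing can be checked mechanically. The following is only a minimal sketch, not part of the original report: it assumes the default column layout shown in this bug (NAME, READY, STATUS, RESTARTS, AGE), and the function names and the threshold of 5 restarts are hypothetical choices made for illustration.

```python
# Sketch: flag pods with many restarts from `oc get pods` table output.
# SAMPLE is the listing quoted in this report; function names and the
# threshold value are illustrative assumptions, not an OpenShift API.

SAMPLE = """\
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-mp5gn   1/1       Running     0          2d
hawkular-cassandra-2-z8rl0   1/1       Running     2          2d
hawkular-metrics-2z5so       1/1       Running     0          2d
hawkular-metrics-5srpo       1/1       Running     0          2d
heapster-0npf8               1/1       Running     18         2d
metrics-deployer-op83i       0/1       Completed   0          3d
"""


def restart_counts(table_text):
    """Return {pod_name: restart_count} parsed from the table text."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    header = lines[0].split()
    # Locate columns by header name rather than by fixed position.
    name_idx = header.index("NAME")
    restarts_idx = header.index("RESTARTS")
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        counts[fields[name_idx]] = int(fields[restarts_idx])
    return counts


def frequently_restarted(table_text, threshold=5):
    """Return sorted pod names whose restart count is >= threshold."""
    counts = restart_counts(table_text)
    return sorted(name for name, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    print(frequently_restarted(SAMPLE))
```

Run against the listing from this report, only heapster-0npf8 crosses the threshold; the single Cassandra pod with 2 restarts stays below it.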