|Summary:||OpenShift Heapster is logging a lot of "no pod" or "no container" found messages|
|Product:||OpenShift Container Platform||Reporter:||Christian Stark <cstark>|
|Component:||Monitoring||Assignee:||Frederic Branczyk <fbranczy>|
|Status:||CLOSED ERRATA||QA Contact:||Junqi Zhao <juzhao>|
|Version:||3.9.0||CC:||alegrand, andcosta, anpicker, aos-bugs, cstark, ddelcian, erooth, fshaikh, hgomes, jforrest, jrosenta, maupadhy, mloibl, openshift-bugs-escalate, pdwyer, pkanthal, pkrupa, pyates, rbdiri, rgudimet, rhowe, rsandu, rsunog, surbania, vlaad|
|Fixed In Version:||Doc Type:||If docs needed, set a value|
|Doc Text:||Story Points:||---|
|Last Closed:||2019-08-26 16:27:38 UTC||Type:||Bug|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Cloudforms Team:||---||Target Upstream Version:|
Description Christian Stark 2018-06-21 09:26:46 UTC
Description of problem:
A customer on OCP 3.7.44 reports that the Heapster logs are flooded with messages like:

2018-06-19T14:11:42.817885000Z I0619 14:11:42.813407 1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw
2018-06-19T14:11:49.217479000Z I0619 14:11:49.216839 1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw
2018-06-19T14:11:54.231636000Z I0619 14:11:54.226316 1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw
I0618 11:19:51.922817 1 handlers.go:264] No metrics for container mongodb-backup in pod assistify/mongodb-backup-1528314300-dxl6l
I0618 12:01:41.487533 1 handlers.go:264] No metrics for container docker-build in pod rim-eu/rim-eu-roter-stable-105-build
I0618 12:01:41.487473 1 handlers.go:264] No metrics for container sti-build in pod wd-dev/doc-3-build
I0618 12:19:08.011249 1 handlers.go:264] No metrics for container aggregated-cms-tools-job-s0a6q in pod ecm-eu/aggregated-cms-tools-job-

The same message (for the same pod or container) gets printed up to 12,000 times, often but not always every 5 seconds; altogether the customer has more than 2 million such entries over 2 days. The affected pods are short-running and have completed or terminated before the message appears.

The logs also contain some occurrences of https://bugzilla.redhat.com/show_bug.cgi?id=1539830, which should be fixed by the latest errata, yet there are 25 such entries. No further issues are visible in the logs, and the customer uses standard settings.

Version-Release number of selected component (if applicable):
[computer@ocp-ansible]$ oc version
oc v3.7.44
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Lots of log entries as described above.

Expected results:
No unnecessary log entries.

Additional info:
Checked further errata but found no relevant bug.
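To quantify how often each distinct message repeats (the ">2 million entries" figure above), the log can be piped through a small grep/uniq pipeline. This is a sketch; the heredoc below feeds in sample lines, and in practice you would pipe in the live log instead, e.g. `oc logs -n openshift-infra <heapster-pod>` (pod name and namespace depend on your deployment).

```shell
# Extract each "No metrics for ..." message and count how often it repeats.
# The heredoc is sample input; replace it with the real Heapster log stream.
grep -o 'No metrics for \(container [^ ]* in \)\?pod [^ ]*' <<'EOF' | sort | uniq -c | sort -rn
I0619 14:11:42.813407 1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw
I0619 14:11:49.216839 1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw
I0618 11:19:51.922817 1 handlers.go:264] No metrics for container mongodb-backup in pod assistify/mongodb-backup-1528314300-dxl6l
EOF
# prints each distinct message prefixed by its count, highest count first
```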
Comment 2 Solly Ross 2018-06-21 15:15:55 UTC
Someone is trying to access Heapster (either via an HPA or the dashboard), and Heapster logs this message when it cannot find any metrics for that container/pod. If the pod is short-running or already terminated, seeing that log message is somewhat expected behavior. We could remove the log message, but it can be a useful debugging tool if you know what you're looking for. Does the customer know what is requesting the metrics?
Comment 17 Ruben Vargas Palma 2019-02-20 17:24:18 UTC
*** Bug 1636453 has been marked as a duplicate of this bug. ***
Comment 20 Ryan Howe 2019-02-21 14:20:27 UTC
As a workaround for the log issue, you can set the log level for Heapster to 1 or 0. This stops the "No metrics for container %s in pod %s/%s" messages from appearing. To set the log level, add `--v=1` to the Heapster replication controller under `template.spec.containers.command`.
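For reference, the change amounts to something like the following excerpt of the Heapster replication controller (a sketch only; the container name, existing command, and arguments will differ in your deployment, so adjust via `oc edit rc heapster -n openshift-infra` against what is actually there):

```yaml
spec:
  template:
    spec:
      containers:
      - name: heapster
        command:
        # ...existing heapster command and arguments stay as they are...
        - --v=1   # lower verbosity to suppress the "No metrics for ..." messages
```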
Comment 21 ravig 2019-04-02 18:46:52 UTC
@Fatima @Christian, can you confirm whether the workaround suggested by Ryan is acceptable, or do you want to go with the code change I suggested? I am planning to close this bug if we don't have any update within a week.
Comment 22 Fatima 2019-04-03 00:54:15 UTC
Hi Ravi, it seems the case was closed automatically because the customer did not respond. The last comment on the case was by a Red Hatter (most probably an SA for that customer) saying that the customer would try the suggested workaround. You can close this bug after Christian's approval ;) Thanks, Fatima
Comment 23 Andre Costa 2019-04-03 07:40:07 UTC
Hi, I don't think we can close this ticket. I have been working with the customer and the workaround is not working. The customer has updated the cluster and is now on OCP v3.9. Thank you
Comment 24 ravig 2019-04-03 18:39:50 UTC
Andre, can you provide the Heapster logs with the log level set to 1?
Comment 25 Andre Costa 2019-04-04 07:12:53 UTC
Ravig, I have asked the customer for the updated logs from Heapster.
Comment 27 Andre Costa 2019-04-10 08:51:48 UTC
Hi, I'm updating the case with the most recent logs from Heapster.
Comment 30 Seth Jennings 2019-06-18 13:52:44 UTC
*** Bug 1720246 has been marked as a duplicate of this bug. ***
Comment 34 Frederic Branczyk 2019-07-02 12:32:25 UTC
Moving to MODIFIED as the PR to make the log verbosity configurable has been merged: https://github.com/openshift/openshift-ansible/pull/11735. With it, customers can set the log verbosity to 0 (default 1) if they don't want to see these messages.
Comment 36 Junqi Zhao 2019-08-20 11:23:00 UTC
The openshift_metrics_heapster_log_verbosity ansible parameter has been added; the default value is 1. Setting openshift_metrics_heapster_log_verbosity=0 reduces the verbosity.

# rpm -qa | grep ansible
openshift-ansible-docs-3.9.99-1.git.0.6d3f661.el7.noarch
openshift-ansible-roles-3.9.99-1.git.0.6d3f661.el7.noarch
openshift-ansible-playbooks-3.9.99-1.git.0.6d3f661.el7.noarch
openshift-ansible-3.9.99-1.git.0.6d3f661.el7.noarch
ansible-184.108.40.206-1.el7ae.noarch
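In the fixed openshift-ansible, the parameter goes into the inventory; a minimal sketch (the inventory path and host group are the usual OCP defaults and may differ in your environment):

```ini
# /etc/ansible/hosts (inventory excerpt)
[OSEv3:vars]
# 0 = quietest; default is 1, which still shows the "No metrics for ..." lines
openshift_metrics_heapster_log_verbosity=0
```

Re-running the metrics playbook then redeploys Heapster with the lower verbosity.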
Comment 38 errata-xmlrpc 2019-08-26 16:27:38 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2550