Bug 1593634 - OpenShift Heapster is logging a lot of "no pod" or "no container" found messages
Summary: OpenShift Heapster is logging a lot of "no pod" or "no container" found messages
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 3.9.z
Assignee: Frederic Branczyk
QA Contact: Junqi Zhao
URL:
Whiteboard:
Duplicates: 1636453 1720246
Depends On:
Blocks:
Reported: 2018-06-21 09:26 UTC by Christian Stark
Modified: 2023-09-15 00:10 UTC (History)
CC List: 25 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-26 16:27:38 UTC
Target Upstream Version:
Embargoed:


Attachments
heapster_logs (7.04 KB, text/plain)
2019-04-10 08:51 UTC, Andre Costa
no flags


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1636453 | 0 | unspecified | CLOSED | unable to get metrics for resource cpu: no metrics returned from heapster | 2021-12-10 17:48:25 UTC
Red Hat Product Errata | RHBA-2019:2550 | 0 | None | None | None | 2019-08-26 16:27:42 UTC

Internal Links: 1636453

Description Christian Stark 2018-06-21 09:26:46 UTC
Description of problem:


A customer on 3.7.44 reports that the Heapster logs are full of messages like:

2018-06-19T14:11:42.817885000Z I0619 14:11:42.813407       1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw
2018-06-19T14:11:49.217479000Z I0619 14:11:49.216839       1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw
2018-06-19T14:11:54.231636000Z I0619 14:11:54.226316       1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw

I0618 11:19:51.922817 1 handlers.go:264] No metrics for container mongodb-backup in pod assistify/mongodb-backup-1528314300-dxl6l
I0618 12:01:41.487533 1 handlers.go:264] No metrics for container docker-build in pod rim-eu/rim-eu-roter-stable-105-build
I0618 12:01:41.487473 1 handlers.go:264] No metrics for container sti-build in pod wd-dev/doc-3-build
I0618 12:19:08.011249 1 handlers.go:264] No metrics for container aggregated-cms-tools-job-s0a6q in pod ecm-eu/aggregated-cms-tools-job-


The same message (for the same pod or container) is printed 12000 times, often but not always every 5 seconds. In total, the customer has more than 2 million entries like this over 2 days.
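
To see which pods and containers account for most of the volume, the messages can be grouped by their target. The following is a minimal sketch, assuming only the two message shapes quoted above; the function name `count_no_metrics` and the regular expression are illustrative and not part of Heapster.

```python
import re
from collections import Counter

# Matches the two message shapes quoted in this report:
#   "No metrics for pod <namespace>/<pod>"
#   "No metrics for container <name> in pod <namespace>/<pod>"
NO_METRICS = re.compile(
    r"No metrics for (?:container (?P<container>\S+) in )?pod (?P<pod>\S+)"
)


def count_no_metrics(lines):
    """Count 'No metrics' messages per (pod, container) pair.

    The container is None for pod-level messages.
    """
    counts = Counter()
    for line in lines:
        match = NO_METRICS.search(line)
        if match:
            counts[(match.group("pod"), match.group("container"))] += 1
    return counts


# Sample lines copied from the excerpt above.
sample = [
    "I0619 14:11:42.813407       1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw",
    "I0619 14:11:49.216839       1 handlers.go:215] No metrics for pod iis-tu/context-cache-3990787087-xhbdw",
    "I0618 12:01:41.487473 1 handlers.go:264] No metrics for container sti-build in pod wd-dev/doc-3-build",
]

for (pod, container), n in count_no_metrics(sample).most_common():
    print(n, pod, container or "-")
```

Run against the full log, the most frequent entries would show whether a few completed pods are being polled repeatedly.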

The pods are short-running and have completed or been terminated before the message is logged.


In the logs I also found some occurrences of
https://bugzilla.redhat.com/show_bug.cgi?id=1539830
which should be fixed by the latest errata, but there are 25 entries...

No further issues were seen in the logs, and the customer uses standard settings.


Version-Release number of selected component (if applicable):

[computer@ocp-ansible]$ oc version
oc v3.7.44
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:

lots of log entries as described above

Expected results:

no unnecessary logs anymore

no log entries

Additional info:

Checked further errata but found no relevant bug.

Comment 2 Solly Ross 2018-06-21 15:15:55 UTC
Someone's trying to access Heapster (either via an HPA or the dashboard), and Heapster is logging an error message when it couldn't find any metrics for that container/pod.  If the pod is short-running or terminated, that's somewhat expected behavior to see that log message.  We could kill the log message, but it can be a useful debug tool if you know what you're looking for.  Does the customer know what's requesting the metrics?

Comment 17 Ruben Vargas Palma 2019-02-20 17:24:18 UTC
*** Bug 1636453 has been marked as a duplicate of this bug. ***

Comment 20 Ryan Howe 2019-02-21 14:20:27 UTC
As a workaround for the log issue, you can set the log level for heapster to 1 or 0. This will stop the message "No metrics for container %s in pod %s/%s"[1] from being logged.

To set the log level, add `--v=1` to the replication controller for heapster under spec.template.spec.containers[].command
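
The intended change to the command list can be sketched as follows. This is only an illustration, assuming the replication controller has been loaded as a Python dict; `set_verbosity` is a hypothetical helper, and the command entries in the example are placeholders rather than the real Heapster arguments.

```python
def set_verbosity(command, level):
    """Return a copy of `command` with the --v flag set to `level`.

    An existing --v=... entry is replaced in place; if none exists,
    the flag is appended. All other arguments are left untouched.
    """
    flag = f"--v={level}"
    result = []
    replaced = False
    for arg in command:
        if arg.startswith("--v="):
            if not replaced:
                result.append(flag)
                replaced = True
        else:
            result.append(arg)
    if not replaced:
        result.append(flag)
    return result


# Hypothetical stand-in for the heapster replication controller.
# "heapster-placeholder" is NOT the real command; the real list keeps
# whatever arguments the deployment already has.
rc = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "heapster", "command": ["heapster-placeholder"]}
                ]
            }
        }
    }
}

container = rc["spec"]["template"]["spec"]["containers"][0]
container["command"] = set_verbosity(container["command"], 1)
print(container["command"])
```

The helper replaces an existing `--v=` entry rather than appending a second one, so the command never carries two conflicting verbosity flags.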

Comment 21 ravig 2019-04-02 18:46:52 UTC
@Fatima @Christian,

Can you confirm whether the workaround suggested by Ryan is OK, or do you want to go with the code change I suggested? I am planning to close this bug if we don't have any update within a week.

Comment 22 Fatima 2019-04-03 00:54:15 UTC
Hi Ravi,

It seems like the case was closed automatically because the customer did not respond.

The last comment on the case was by a Red Hatter (most probably an SA for that customer), saying that the customer will try the suggested workaround.

You can close this bug after Christian's approval ;)

Thanks,
Fatima

Comment 23 Andre Costa 2019-04-03 07:40:07 UTC
Hi,

I don't think we can close this ticket. I have been working with the customer and that workaround is not working.

Customer has updated the cluster and they are now on OCP v3.9.

Thank you

Comment 24 ravig 2019-04-03 18:39:50 UTC
Andre,

Can you provide the heapster logs when log-level has been set to 1?

Comment 25 Andre Costa 2019-04-04 07:12:53 UTC
Ravig,

I have asked for the updated logs from heapster.

Comment 26 Andre Costa 2019-04-10 08:51:12 UTC
Created attachment 1554127
heapster_logs

Comment 27 Andre Costa 2019-04-10 08:51:48 UTC
Hi,

I'm updating with the recent logs from heapster.

Comment 30 Seth Jennings 2019-06-18 13:52:44 UTC
*** Bug 1720246 has been marked as a duplicate of this bug. ***

Comment 34 Frederic Branczyk 2019-07-02 12:32:25 UTC
Moving to modified as the PR to configure the log verbosity has been merged. https://github.com/openshift/openshift-ansible/pull/11735

With it, customers can set the log verbosity to 0 (default 1) if they don't want to see these messages.

Comment 36 Junqi Zhao 2019-08-20 11:23:00 UTC
The openshift_metrics_heapster_log_verbosity Ansible parameter has been added; its default value is 1.
Setting openshift_metrics_heapster_log_verbosity=0 reduces the log verbosity.

# rpm -qa | grep ansible
openshift-ansible-docs-3.9.99-1.git.0.6d3f661.el7.noarch
openshift-ansible-roles-3.9.99-1.git.0.6d3f661.el7.noarch
openshift-ansible-playbooks-3.9.99-1.git.0.6d3f661.el7.noarch
openshift-ansible-3.9.99-1.git.0.6d3f661.el7.noarch
ansible-2.4.6.0-1.el7ae.noarch

Comment 38 errata-xmlrpc 2019-08-26 16:27:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2550

Comment 39 Red Hat Bugzilla 2023-09-15 00:10:05 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

