Bug 1419492

Summary:	Sanitize OpenShift Logging
Product:	OpenShift Container Platform	Reporter:	Marko Myllynen <myllynen>
Component:	Node	Assignee:	Derek Carr <decarr>
Status:	CLOSED WONTFIX	QA Contact:	Xiaoli Tian <xtian>
Severity:	low	Docs Contact:
Priority:	medium
Version:	3.4.0	CC:	anli, aos-bugs, eparis, gblomqui, jokerman, jswensso, mmccomas, pdwyer, pportant
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-07-03 15:27:29 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Marko Myllynen 2017-02-06 10:43:28 UTC

Description of problem:
After installing OCP 3.4, I see many unhelpful messages in the system log. For example, this one is logged every 10 seconds (!):

Feb  6 10:24:35 infra01 atomic-openshift-node: I0204 10:24:35.999002    4959 conversion.go:133] failed to handle multiple devices for container. Skipping Filesystem stats

This one also comes up quite often:

Feb  6 11:04:13 infra01 systemd: Scope libcontainer-32749-systemd-test-default-dependencies.scope has no PIDs. Refusing.

And also:

Feb  6 11:07:20 node01 atomic-openshift-node: I0204 11:07:20.453234   23617 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/secret/f427dfb5-eb44-11e6-a750-525400dbbcb2-hawkular-metrics-client-secrets" (spec.Name: "hawkular-metrics-client-secrets") to pod "f427dfb5-eb44-11e6-a750-525400dbbcb2" (UID: "f427dfb5-eb44-11e6-a750-525400dbbcb2"). Volume is already mounted to pod, but remount was requested.

And these as well:

Feb  6 11:03:15 infra01 atomic-openshift-node: I0204 11:03:15.030854   23867 node_auth.go:143] Node request attributes: namespace=, user=&user.DefaultInfo{Name:"system:serviceaccount:openshift-infra:heapster", UID:"f67c12c5-eb44-11e6-a750-525400dbbcb2", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-infra", "system:authenticated", "system:authenticated"}, Extra:map[string][]string{}}, attrs=authorizer.DefaultAuthorizationAttributes{Verb:"create", APIVersion:"v1", APIGroup:"", Resource:"nodes/stats", ResourceName:"infra01.example.com", RequestAttributes:interface {}(nil), NonResourceURL:false, URL:"/stats/container/"}
Feb  6 11:03:15 infra01 atomic-openshift-node: I0204 11:03:15.045958   23867 server.go:971] POST /stats/container/: (15.285129ms) 200 [[Go-http-client/1.1] 10.1.1.4:55070]

These probably should not be logged at the default log level (debug might be more appropriate). Using a dedicated log file for them could also be considered.

There may be other cases that I have not come across yet; it would be great to have OCP logging reviewed and sanitized in general. If this has already been done upstream, please consider backporting it to RHEL 7 / OCP 3.x.

Some of the messages, like the systemd one above, may not come directly from OCP components, but a holistic view would be helpful here so that an idle system does not log much after installation.
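
To help with such a review, here is a minimal sketch (not part of any OCP tooling) that tallies syslog lines by the source location in their glog-style header, so the noisiest call sites on an idle system stand out. It assumes the header layout seen in the samples above (severity letter, mmdd, time, process/thread id, file:line, then "]"). The name tally_sources is hypothetical, and lines without such a header, like the systemd one, are simply skipped.

```python
import re
from collections import Counter

# Assumed glog-style header, as seen in the samples in this report:
#   I0204 10:24:35.999002    4959 conversion.go:133] message text
GLOG_RE = re.compile(
    r"(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<pid>\d+) (?P<src>[^\s:]+:\d+)\] (?P<msg>.*)"
)


def tally_sources(lines):
    """Count log lines per (severity, file:line); hypothetical helper."""
    counts = Counter()
    for line in lines:
        match = GLOG_RE.search(line)
        if match:  # lines without a glog header (e.g. systemd) are ignored
            counts[(match.group("sev"), match.group("src"))] += 1
    return counts


# Sample lines copied from this report.
SAMPLE = [
    "Feb  6 10:24:35 infra01 atomic-openshift-node: I0204 10:24:35.999002    4959 "
    "conversion.go:133] failed to handle multiple devices for container. "
    "Skipping Filesystem stats",
    "Feb  6 11:04:13 infra01 systemd: Scope "
    "libcontainer-32749-systemd-test-default-dependencies.scope has no PIDs. Refusing.",
    "Feb  6 11:03:15 infra01 atomic-openshift-node: I0204 11:03:15.045958   23867 "
    "server.go:971] POST /stats/container/: (15.285129ms) 200 "
    "[[Go-http-client/1.1] 10.1.1.4:55070]",
]

for (severity, source), count in tally_sources(SAMPLE).most_common():
    print(severity, source, count)
```

Feeding a longer capture of the system log to tally_sources would give a ranked list of candidates for demotion to a debug level.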

Please adjust the BZ Component as needed.

Version-Release number of selected component (if applicable):
OCP 3.4

Comment 2 Greg Blomquist 2019-07-03 15:27:29 UTC

No activity in 2 years on a low-severity bug.