Bug 1419492

Summary: Sanitize OpenShift Logging
Product: OpenShift Container Platform
Reporter: Marko Myllynen <myllynen>
Component: Node
Assignee: Derek Carr <decarr>
Status: CLOSED WONTFIX
QA Contact: Xiaoli Tian <xtian>
Severity: low
Docs Contact:
Priority: medium
Version: 3.4.0
CC: anli, aos-bugs, eparis, gblomqui, jokerman, jswensso, mmccomas, pdwyer, pportant
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-07-03 15:27:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marko Myllynen 2017-02-06 10:43:28 UTC
Description of problem:
After installing OCP 3.4 I see lots of unhelpful messages being logged in the system log. For example, this one is logged every 10 seconds (!):

Feb  6 10:24:35 infra01 atomic-openshift-node: I0204 10:24:35.999002    4959 conversion.go:133] failed to handle multiple devices for container. Skipping Filesystem stats

This one also comes up quite often:

Feb  6 11:04:13 infra01 systemd: Scope libcontainer-32749-systemd-test-default-dependencies.scope has no PIDs. Refusing.

And also:

Feb  6 11:07:20 node01 atomic-openshift-node: I0204 11:07:20.453234   23617 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/secret/f427dfb5-eb44-11e6-a750-525400dbbcb2-hawkular-metrics-client-secrets" (spec.Name: "hawkular-metrics-client-secrets") to pod "f427dfb5-eb44-11e6-a750-525400dbbcb2" (UID: "f427dfb5-eb44-11e6-a750-525400dbbcb2"). Volume is already mounted to pod, but remount was requested.

And further:

Feb  6 11:03:15 infra01 atomic-openshift-node: I0204 11:03:15.030854   23867 node_auth.go:143] Node request attributes: namespace=, user=&user.DefaultInfo{Name:"system:serviceaccount:openshift-infra:heapster", UID:"f67c12c5-eb44-11e6-a750-525400dbbcb2", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-infra", "system:authenticated", "system:authenticated"}, Extra:map[string][]string{}}, attrs=authorizer.DefaultAuthorizationAttributes{Verb:"create", APIVersion:"v1", APIGroup:"", Resource:"nodes/stats", ResourceName:"infra01.example.com", RequestAttributes:interface {}(nil), NonResourceURL:false, URL:"/stats/container/"}
Feb  6 11:03:15 infra01 atomic-openshift-node: I0204 11:03:15.045958   23867 server.go:971] POST /stats/container/: (15.285129ms) 200 [[Go-http-client/1.1] 10.1.1.4:55070]

These probably should not be logged at the default log level (debug might be more appropriate), and a dedicated log file for them might also be worth considering.

There might be other cases as well that I didn't come across this time; it would be great to have OCP logging reviewed and sanitized in general. If this has already been done upstream, please consider backporting it to RHEL 7 / OCP 3.x.

Some of the messages, like the systemd one above, might not come directly from OCP components, but a holistic view would be helpful here so that an idle system does not log much after installation.
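To support such a review, the noisiest sources could be ranked by counting messages per source location. The following is a minimal sketch, not part of any OCP tooling: it assumes the glog-style header seen in the samples above (severity letter, MMDD, time, PID, file:line), and the names GLOG_RE, tally_sources and SAMPLE are hypothetical. Lines without that header, such as the systemd one, are skipped.

```python
import re
from collections import Counter

# Header layout assumed from the samples in this report:
#   I0204 10:24:35.999002    4959 conversion.go:133] message...
GLOG_RE = re.compile(
    r"(?P<severity>[IWEF])(?P<mmdd>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) (?P<source>[\w.\-/]+:\d+)\] "
)


def tally_sources(lines):
    """Count log lines per (severity, file:line) source location."""
    counts = Counter()
    for line in lines:
        match = GLOG_RE.search(line)
        if match:  # lines without a glog header (e.g. systemd) are ignored
            counts[(match.group("severity"), match.group("source"))] += 1
    return counts


# Three lines copied from this report; in practice the input would be
# the system log itself.
SAMPLE = [
    "Feb  6 10:24:35 infra01 atomic-openshift-node: I0204 10:24:35.999002    4959 conversion.go:133] failed to handle multiple devices for container. Skipping Filesystem stats",
    "Feb  6 11:04:13 infra01 systemd: Scope libcontainer-32749-systemd-test-default-dependencies.scope has no PIDs. Refusing.",
    "Feb  6 11:03:15 infra01 atomic-openshift-node: I0204 11:03:15.045958   23867 server.go:971] POST /stats/container/: (15.285129ms) 200 [[Go-http-client/1.1] 10.1.1.4:55070]",
]

# Print the sources from most to least frequent.
for (severity, source), count in tally_sources(SAMPLE).most_common():
    print(count, severity, source)
```

Running this over the system log of an idle node for a fixed period would give a rough ranking of which source locations are candidates for a lower log level.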

Please adjust the BZ Component as needed.

Version-Release number of selected component (if applicable):
OCP 3.4

Comment 2 Greg Blomquist 2019-07-03 15:27:29 UTC
No activity in 2 years on a low severity bug