Bug 1502850 - [DOCKER] Remove oci-umount spam
Summary: [DOCKER] Remove oci-umount spam
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: 3.8.0
Assignee: Vivek Goyal
QA Contact: DeShuai Ma
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-10-16 20:27 UTC by Eric Paris
Modified: 2018-03-28 14:07 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-28 14:07:14 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
nom nom nom oci-umount spam, so yummy! spam and rice! (2.18 KB, text/plain)
2017-10-16 21:50 UTC, Eric Paris
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1502846 0 unspecified CLOSED [DOCKER] Log spam must be greatly reduced 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHBA-2018:0489 0 None None None 2018-03-28 14:07:53 UTC

Internal Links: 1502846

Description Eric Paris 2017-10-16 20:27:32 UTC
oci-umount-1.12.6-48.git0fdc778.el7.x86_64

When using docker there is lots of log spam about 'Failed to canonicalize path' and other such things. (See attached for specific example).

The large rate of (useless) log messages makes aggregated logging fail at scale. Please remove this spam.

Comment 1 Eric Paris 2017-10-16 21:50:32 UTC
Created attachment 1339481 [details]
nom nom nom oci-umount spam, so yummy! spam and rice!

Comment 5 Daniel Walsh 2017-10-18 17:08:26 UTC
Yes we have reduced some of the noice in oci-umount, but mainly moving stuff down from info to debug.  Not sure if it will be still too noisy for OpenShift.

Eric the messages you are showing are now at debug level.  We could further drop them bu modifying the /etc/oci-umount.conf file to only contain the overlay link.  Rather then all possible links.

Comment 6 Eric Paris 2017-10-19 16:25:21 UTC
Why am I needinfo? The container team should be capable of running containers with oci-umount and seeing if the messages show up in syslog/journal. If yes, fix that. If no, no need to ask me   :)

Comment 7 Daniel Walsh 2017-10-19 16:44:17 UTC
Eric I asked you about if moving to debug level is good enough, or should we customize for OpenShift only to indicate the expected leak points.

We currently ship with



/var/lib/docker/overlay2
/var/lib/docker/overlay
/var/lib/docker/devicemapper
/var/lib/docker/containers/*
/var/lib/docker-latest/overlay2
/var/lib/docker-latest/overlay
/var/lib/docker-latest/devicemapper
/var/lib/docker-latest/containers/*
/var/lib/containers/storage/lvm
/var/lib/containers/storage/devicemapper
/var/lib/containers/storage/overlay
/var/run/containers/storage

New version would allow you to specify only 
/var/lib/docker/overlay2
/var/lib/docker/containers/*


Which would get rid of most of spam around skipping mount points that don't exist.

Comment 8 Eric Paris 2017-10-19 16:47:30 UTC
I don't know what setting it to debug did. If it shows up in syslog/journal it isn't good enough. If it isn't, it is good enough.

customizing the config doesn't seem right either. why would anyone want these messages in /var/log/messages or journal?

Comment 11 Vivek Goyal 2017-10-23 12:34:18 UTC
(In reply to Eric Paris from comment #8)
> I don't know what setting it to debug did. If it shows up in syslog/journal
> it isn't good enough. If it isn't, it is good enough.
> 
> customizing the config doesn't seem right either. why would anyone want
> these messages in /var/log/messages or journal?

/etc/oci-umount.conf contains list of paths which should be unmounted from inside the container. If that path does not exist inside container, what should oci-umount do. I see following two options.

A. Log a debug message and continue.
B. Don't log anything and continue.

Currently we are doing option A. And looks like you are recommending option B.

My concern with option B is that if user does a typo then he has nothing in messages which says user specified path could not be unmounted. 

From a design point of view, I think option B is not very good.

Problem is currently our /etc/oci-umount.conf is very big and contains lots
of static paths which might not be a mount point inside container at all.

One option is to break down /etc/oci-umount.conf and ship paths in individual packages. So that if crio is not installed, we don't bother about these paths.

If that's still not good enough, I guess we can reduce these messages. I am afraid that few days later somebody will open another bug saying, hey my path is not valid and oci-umount silently ignored it and did not complain.

Comment 14 Eric Paris 2017-10-23 14:12:56 UTC
Another option might be a 'debug' mode for oci-umount which emits these messages, but which defaults to off?

I strongly choose B over A. But if you believe A is important lets make a debug mode flag somewhere, please.

Comment 15 Daniel Walsh 2017-10-23 20:27:31 UTC
Well I guess I would ask why are you grabbing all "DEBUG" messages.  Rather then all messages at Info or above.

Personally I think B is fine, and wait til someone complains.   But I believe you guys are over complaining about something that the logging tool should handle.  Don't collect DEBUG Messages by default.

Comment 16 Peter Portante 2017-10-24 01:02:09 UTC
(In reply to Daniel Walsh from comment #15)
> Well I guess I would ask why are you grabbing all "DEBUG" messages.  Rather
> then all messages at Info or above.

The logging collectors today don't collect DEBUG messages when using reading from journald [1], because we have the metadata the tells us the messages are "DEBUG" messages.

However, the version of logging used when this occurred was collecting from /var/log/messages, where RHEL defaults for rsyslog strip out valuable metadata when writing to a file.  The "log level" is one of the pieces that is stripped.  Commit [2] changes aggregated logging to always collect from journald instead.

It is my opinion that the aggregated logging system should never choose to exclude data collected unless directed by the user (see BZ 1505602 [3]).

[1] https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/configs.d/openshift/filter-exclude-journal-debug.conf
[2] https://github.com/openshift/origin-aggregated-logging/commit/829725d65b7139e79373b49d84c3333597287b3b
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1505602

> 
> Personally I think B is fine, and wait til someone complains.   But I
> believe you guys are over complaining about something that the logging tool
> should handle.  Don't collect DEBUG Messages by default.

I don't believe any software we provide should exclude and drop data on the floor unless the administrator opts-in to such behavior.

Just because a message has the metadata of DEBUG on it does not mean all such messages should fall in the same category, because such ascribed logging levels are uniformly understood to be the same.

Data loss should be avoided, no?

Hence, BZ 1505602 [3] was filed.

Comment 17 Daniel Walsh 2017-10-24 16:20:56 UTC
Most logging systems I know of only record at Info and above by default.  If the admin chose to query at Debug then they should query at Debug.  But then complaining about the volume of debug messages seems a little crazy.  Lots of tools are noisy at Debug level.  Now it would be nice to have an easy mechansim to turn debug on and off in oci-umount, but right now that does not exists.

Comment 18 Peter Portante 2017-10-24 17:28:07 UTC
(In reply to Daniel Walsh from comment #17)
> Most logging systems I know of only record at Info and above by default.

Are you referring to the likes of journald and rsyslog?  If you are, then those are not aggregated logging systems, are they?

Ins't aggregated logging way too late in the game to be trying to filter messages automatically, as there are potentially many heterogenous systems for which such a notion does not uniformly apply?

We all seem to agree that local logging systems like journald and rsyslog should have a way to filter at a logging level defined and supported in their environment.  No question.

> If the admin chose to query at Debug then they should query at Debug.

Sure, this is easily accomplished in the likes of Kibana, not so easy when that metadatas is stripped, written by rsyslog to /var/log/messages, and the scraped by the aggregated logging system.

> But then complaining about the volume of debug messages seems a little crazy.
> Lots of tools are noisy at Debug level.

First off all, depending on how the debug messages are handled the volume of logs can become a performance problem.  Nothing is for free, everything costs us. Today, because of RHEL defaults, we only have so much space in journald in /run/log, so this volume of log messages pushes out other data shortening the total amount of logs we can keep around.

Second, we, as productizers, need to be responsible to avoid causing our customers, and our own support teams, headaches with bad behaviors in our product.  Emitting debug messages for every container invocation across an entire cluster is an explosion of unwanted data.  The notion of what patterns match and don't match is known outside the container, so it can be determined once per host, logged properly at a warning level, and avoid billions of debug messages and associated costs.

Comment 19 Vivek Goyal 2017-10-27 19:29:58 UTC
Ok, I am thinking of making log level configuration. May be a new file /usr/share/oci-umount/oci-umount-options.conf

log_level=LOG_INFO

log_level will be configurable. By default logs of level INFO and higher importance will be logged. If user is interested in debug level logs, they
can change to log_level=LOG_DEBUG

By default oci-umount will parse the file present in /usr/share. User can override system shipped settings by putting a file /etc/oci-umount/oci-umount-options.conf

Will implement this soon. Any thoughts?

Comment 20 Daniel Walsh 2017-10-30 21:01:06 UTC
Personally I would prefer to get the log level or debug flag from the runtime.  Rather then hacking up yet another place for an admin to edit.  
If not that then I would prefer to put this into the oci-umount.conf file rather then having two conf files.

If you add 

log-level=LOG_INFO 
We can have th tool check for paths beginning with / otherwise we treat them as name value pairs.  Having two config file makes no sense.

Comment 21 Vivek Goyal 2017-10-31 15:43:52 UTC
Had conversation with Dan and he convinced me that this should not be done per hook, rather all hooks should follow log_level as desired by user of container runtime.

So we need some generic changes in runc and possibly spec. Once that happens changing oci-umount should be simple.

Dan has opened upstream conversation here.

https://github.com/opencontainers/runc/issues/1628

Comment 24 Vivek Goyal 2017-11-03 18:18:01 UTC
Ok, a generic fix might take a long time to land (if it ever happens). So for now I am proposing a fix for oci-umount. Provide a new option "log_level" to control the logging level of oci-umount. By default it is set to LOG_INFO. That means debug messages are not sent to journal. User can override it.

I have opened a PR for review here.

https://github.com/projectatomic/oci-umount/pull/27

Comment 26 DeShuai Ma 2018-01-04 09:12:08 UTC
# rpm -qa |grep oci-umount
oci-umount-2.3.0-1.git51e7c50.el7.x86_64

# cat /usr/share/oci-umount/oci-umount-options.conf
# Options for oci-umount hook

# This file is in ini format. Currently only one section named "options"
# is supported.
#
# If you want to override any options, don't edit this file. Instead
# create a new file /etc/oci-umount/oci-umount-options.conf and override
# options there.
#
# log_level determins what logs of sent to syslog. Log levels are same as
# described in syslog(3). Logs of same priority as log_level and higher
# priority will be sent to syslog and rest will be ignored. By default
# logs of level LOG_INFO and lower priority (higher importance) are sent
# to syslog and LOG_DEBUG is not logged. If one wants to enable debug
# level logs, specify log_level=LOG_DEBUG
[options]
log_level=LOG_INFO

Comment 29 errata-xmlrpc 2018-03-28 14:07:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489


Note You need to log in before you can comment on or make changes to this bug.