We would like to be able to disable specific messages in the event logs. For example, we would like to suppress this message:

    Mar 14 18:09:29 sd-2d4e-3c2e atomic-openshift-node: E0314 18:09:29.258706 44414 kubelet_network.go:191] checkLimitsForResolvConf: Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains!
1. Edit the fluentd configmap, e.g.

    oc edit cm logging-fluentd

2. In the fluent.conf key, add the following just inside the @INGRESS label:

    <label @INGRESS>
    ## filters
    <filter **>
      @type grep
      <exclude>
        key MESSAGE
        pattern checkLimitsForResolvConf: Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains!
      </exclude>
      <exclude>
        key message
        pattern checkLimitsForResolvConf: Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains!
      </exclude>
      <exclude>
        key log
        pattern checkLimitsForResolvConf: Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains!
      </exclude>
    </filter>

Not sure about the format of "pattern" but you get the gist. The actual text could be in the MESSAGE, message, or log field, depending on the source. If you know the message will only come from the journal, you could optimize this somewhat:

    <filter journal>
      @type grep
      <exclude>
        key MESSAGE
        pattern checkLimitsForResolvConf: Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains!
      </exclude>
    </filter>

See https://docs.fluentd.org/v0.12/articles/filter_grep for more details.
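To make the intent of the exclude rules above concrete, here is a minimal Python sketch of the matching logic. It is not fluentd code. It assumes that a record is dropped as soon as any one exclude rule matches, which is my understanding of the grep filter, and it uses Python's re.escape so that the dots, slashes and "!" in the message are matched literally. The names should_drop and filter_records are hypothetical helpers introduced only for this illustration.

```python
import re

# Text taken from the log line in this report. re.escape keeps characters
# such as '.' literal instead of letting '.' match any character.
NOISY_TEXT = (
    "checkLimitsForResolvConf: Resolv.conf file '/etc/resolv.conf' "
    "contains search line consisting of more than 3 domains!"
)
EXCLUDE_PATTERN = re.compile(re.escape(NOISY_TEXT))

# Fields that may carry the text, depending on the source (journal vs. container log).
CANDIDATE_KEYS = ("MESSAGE", "message", "log")


def should_drop(record):
    """Return True if any candidate field of the record matches the pattern.

    Assumption: exclude rules are OR-ed, so one match is enough to drop.
    """
    return any(
        EXCLUDE_PATTERN.search(str(record[key]))
        for key in CANDIDATE_KEYS
        if key in record
    )


def filter_records(records):
    """Keep only the records that no exclude rule matches."""
    return [record for record in records if not should_drop(record)]
```

Note that the later log lines in this report spell the name as CheckLimitsForResolvConf (capital C), so a case-sensitive pattern like the one sketched here would not match them; a real filter pattern may need to allow for both spellings.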
1. Proposed title of this feature request
Make checkLimitsForResolvConf event messages configurable

3. What is the nature and description of the request?
Event messages from checkLimitsForResolvConf are polluting both system and project logs. The request is to add a configuration option to turn those event messages off.

4. Why does the customer need this? (List the business requirements here)
The event message spam is impacting both OCP admins and projects running on OpenShift, making it hard to determine whether log entries are actionable. It is also a general problem for applications being onboarded into OpenShift.

5. How would the customer like to achieve this? (List the functional requirements here)
- Add a configuration option to disable certain event messages at the cluster level and at the project level.

6. For each functional requirement listed in question 5, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.
- Change the requested configuration
- Confirm that the event messages no longer occur

7. Is there already an existing RFE upstream or in Red Hat bugzilla?
No

8. Does the customer have any specific timeline dependencies?
As soon as possible.

9. Is the sales team involved in this request and do they have any additional input?
No

10. List any affected packages or components.
OpenShift

11. Would the customer be able to assist in testing this functionality if implemented?
Yes
Moving to Pod team as it seems like the node agent is emitting them... I agree that we should probably limit the amount of events sent from these components.
Peter - I believe you have been investigating areas where OpenShift can reduce its log output.
See also https://bugzilla.redhat.com/show_bug.cgi?id=1555057.
https://github.com/kubernetes/kubernetes/pull/64860 - This is the PR created upstream.
https://github.com/openshift/origin/pull/20070 - This is the PR created against origin.
1. Reboot (master or node) or restart the network service, then update /etc/resolv.conf to include more than 3 domains, e.g.

    search cluster.local ec2.internal kube-system.svc.cluster.local svc.cluster.local cluster.local kubelet.kubernetes.rancher.internal kubernetes.rancher.internal rancher.internal c.rancher-qa.internal google.internal

2. Check whether an excessive number of CheckLimitsForResolvConf error lines is generated:

    # journalctl --since="1 hour ago"| grep -i CheckLimitsForResolvConf
    ...
    Aug 01 03:39:34 ip-172-18-11-220.ec2.internal atomic-openshift-node[2110]: E0801 03:39:34.059287 2110 dns.go:180] CheckLimitsForResolvConf: Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains!
    Aug 01 03:39:34 ip-172-18-11-220.ec2.internal atomic-openshift-node[2110]: I0801 03:39:34.059835 2110 server.go:285] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"ip-172-18-11-220.ec2.internal", UID:"ip-172-18-11-220.ec2.internal", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'CheckLimitsForResolvConf' Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains!
    ...

Verified as fixed on v3.11.0-0.11.0; the issue above could not be reproduced:

    [root@ip-172-18-4-172 ~]# systemctl status network
    ● network.service - LSB: Bring up/down networking
       Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
       Active: active (exited) since Thu 2018-08-09 01:33:48 EDT; 1h 17min ago
         Docs: man:systemd-sysv-generator(8)

    Aug 09 01:33:47 ip-172-18-4-172.ec2.internal systemd[1]: Starting LSB: Bring up/down networking...
    Aug 09 01:33:48 ip-172-18-4-172.ec2.internal network[729]: Bringing up loopback interface: [ OK ]
    Aug 09 01:33:48 ip-172-18-4-172.ec2.internal network[729]: Bringing up interface eth0: [ OK ]
    Aug 09 01:33:48 ip-172-18-4-172.ec2.internal systemd[1]: Started LSB: Bring up/down networking.
    [root@ip-172-18-4-172 ~]# systemctl restart network
    [root@ip-172-18-4-172 ~]# vim /etc/resolv.conf
    [root@ip-172-18-4-172 ~]# journalctl --since="1 hour ago"| grep -i CheckLimitsForResolvConf
    [root@ip-172-18-4-172 ~]# journalctl --since="1 hour ago"| grep -i CheckLimitsForResolvConf
    [root@ip-172-18-4-172 ~]# oc version
    oc v3.11.0-0.11.0
    kubernetes v1.11.0+d4cacc0
    features: Basic-Auth GSSAPI Kerberos SPNEGO

    Server https://ip-172-18-4-172.ec2.internal:8443
    openshift v3.11.0-0.11.0
    kubernetes v1.11.0+d4cacc0
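When reproducing this, it can help to confirm that the edited /etc/resolv.conf really has more than 3 search domains before looking at the journal. The following is a rough Python sketch of such a check, not the kubelet's own logic. The limit of 3 is taken from the log message in this report, the function names search_domains and exceeds_limit are hypothetical, and the sketch assumes that only the last "search" line counts, as described in resolv.conf(5).

```python
def search_domains(resolv_conf_text):
    """Return the domains listed on the last 'search' line, or [] if none.

    Assumption: when several 'search' lines exist, only the last one is used.
    """
    domains = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            domains = fields[1:]
    return domains


def exceeds_limit(resolv_conf_text, limit=3):
    """Return True if the search line lists more than `limit` domains.

    The default of 3 comes from the log message, not from kubelet source.
    """
    return len(search_domains(resolv_conf_text)) > limit


if __name__ == "__main__":
    # Reads the real file when run directly on a node.
    with open("/etc/resolv.conf") as handle:
        text = handle.read()
    print(len(search_domains(text)), "search domains; over limit:", exceeds_limit(text))
```

With the ten-domain search line from step 1, exceeds_limit returns True, so that configuration should trigger the message on an affected version.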
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2652
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days