Bug 1468420 - LoadBalancerRR: Removing log spam
LoadBalancerRR: Removing log spam
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking (Show other bugs)
3.6.0
Unspecified Unspecified
unspecified Severity medium
: ---
: 3.7.0
Assigned To: Dan Williams
Meng Bo
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-07-06 23:16 EDT by Eric Paris
Modified: 2017-11-28 17:00 EST (History)
4 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: We logged useless information, often, and at log level 2. Consequence: The logs fill with junk. Fix: Reduce the times we log to those when there are actual changes. Result: Far fewer messages, and only when they are relevant.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-28 17:00:15 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Eric Paris 2017-07-06 23:16:45 EDT
Over 2 minutes I see log messages like:

LoadBalancerRR: Removing endpoints for ops-health-monitoring/pull-07062050z-ie:8080-tcp

11 times. This leaves me wondering what the message really means. Why does it need to be removed every 10 seconds? Shouldn't it be removed 1 time and be gone? Why do we keep printing this over and over?

If this is not actually a problem or telling me anything interesting I consider it spam and should be at a higher level than V(2). But I want to understand what it is actually telling me and fix any bug that it might be indicating before we just hide it.
Comment 2 Dan Williams 2017-07-07 12:01:17 EDT
(In reply to Eric Paris from comment #0)
> Over 2 minutes I see log messages like:
> 
> LoadBalancerRR: Removing endpoints for
> ops-health-monitoring/pull-07062050z-ie:8080-tcp
> 
> 11 times. This leaves me wondering what the message really means. Why does
> it need to be removed every 10 seconds? Shouldn't it be removed 1 time and
> be gone? Why do we keep printing this over and over?
> 
> If this is not actually a problem or telling me anything interesting I
> consider it spam and should be at a higher level than V(2). But I want to
> understand what it is actually telling me and fix any bug that it might be
> indicating before we just hide it.

It's not telling you anything interesting unless you want to debug the proxy.  We should hide it.  It likely gets printed every time there's an endpoints or service update.
Comment 3 Eric Paris 2017-07-07 16:09:44 EDT
If it is really normal to print that we are removing the SAME thing every 10 seconds that doesn't belong at V(2). Now if we actually add or remove a service that probably should be at V(2). Tagging for 3.7.0 to get it off any 3.6 block list mess.
Comment 4 Dan Williams 2017-07-07 16:49:55 EDT
Upstream PR: https://github.com/kubernetes/kubernetes/pull/48635

Note our Origin PR will be different, since upstream has changed.  But it's literally just:

diff --git a/vendor/k8s.io/kubernetes/pkg/proxy/userspace/roundrobin.go b/vendor/k8s.io/kubernetes/pkg/proxy/userspace/roundrobin.go
index 5d58c3c..f81635b 100644
--- a/vendor/k8s.io/kubernetes/pkg/proxy/userspace/roundrobin.go
+++ b/vendor/k8s.io/kubernetes/pkg/proxy/userspace/roundrobin.go
@@ -299,10 +299,12 @@ func (lb *LoadBalancerRR) OnEndpointsUpdate(allEndpoints []*api.Endpoints) {
        // Remove endpoints missing from the update.
        for k := range lb.services {
                if _, exists := registeredEndpoints[k]; !exists {
-                       glog.V(2).Infof("LoadBalancerRR: Removing endpoints for %s", k)
                        // Reset but don't delete.
                        state := lb.services[k]
-                       state.endpoints = []string{}
+                       if len(state.endpoints) > 0 {
+                               glog.V(2).Infof("LoadBalancerRR: Removing endpoints for %s", k)
+                               state.endpoints = []string{}
+                       }
                        state.index = 0
                        state.affinity.affinityMap = map[string]*affinityState{}
                }
Comment 5 Dan Williams 2017-07-12 11:29:15 EDT
Origin PR: https://github.com/openshift/origin/pull/15166
Comment 6 Meng Bo 2017-07-13 23:23:36 EDT
ose/cb1f3d352b1ed29291086221c06ae07c7f8bb5cf has been merged in to v3.6.144.

Move the bug to ON_QA for testing.
Comment 7 Yan Du 2017-07-17 23:29:30 EDT
Test on latest OCP env, there is no such spam log shown in node log.
openshift v3.6.152.0
kubernetes v1.6.1+5115d708d7
Comment 10 errata-xmlrpc 2017-11-28 17:00:15 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

Note You need to log in before you can comment on or make changes to this bug.