Bug 1804178

Summary: remove excessive logging
Product: OpenShift Container Platform
Component: Networking
Sub Component: openshift-sdn
Version: 4.4
Target Release: 4.4.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Dan Winship <danw>
Assignee: Dan Winship <danw>
QA Contact: zhaozhanqi <zzhao>
CC: anusaxen
Doc Type: No Doc Update
Type: Bug
Clones: 1804745 (view as bug list)
Bug Blocks: 1804745
Last Closed: 2020-05-13 21:59:04 UTC

Description Dan Winship 2020-02-18 12:17:25 UTC
While trying to figure out the "missing service endpoints" bug that eventually turned out to be an upstream kubernetes informer problem, we added a bunch of logging to our proxy code.

This produces occasional very large logs like:

I0218 04:52:19.052809    3509 proxy.go:368] sdn proxy: add svc openshift-controller-manager/controller-manager: &Service{ObjectMeta:{controller-manager  openshift-controller-manager /api/v1/namespaces/openshift-controller-manager/services/controller-manager e2ec59dd-5479-4fb8-8d6a-6621dc072a29 4861 0 2020-02-18 04:42:06 +0000 UTC <nil> <nil> map[] map[prometheus.io/scheme:https prometheus.io/scrape:true service.alpha.openshift.io/serving-cert-secret-name:serving-cert service.alpha.openshift.io/serving-cert-signed-by:openshift-service-serving-signer@1582000927 service.beta.openshift.io/serving-cert-signed-by:openshift-service-serving-signer@1582000927] [] []  []},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:https,Protocol:TCP,Port:443,TargetPort:{0 8443 },NodePort:0,},},Selector:map[string]string{controller-manager: true,},ClusterIP:172.30.208.251,Type:ClusterIP,ExternalIPs:[],SessionAffinity:None,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:,HealthCheckNodePort:0,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamily:nil,},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{},},},}

and very frequent small logs like:

I0218 04:52:21.713330    3509 proxy.go:334] hybrid proxy: syncProxyRules start
I0218 04:52:21.909927    3509 proxy.go:337] hybrid proxy: mainProxy.syncProxyRules complete
I0218 04:52:21.997270    3509 proxy.go:340] hybrid proxy: unidlingProxy.syncProxyRules complete

These logs should go away.

(Note that this will not get rid of:

I0218 04:52:21.997231    3509 proxier.go:371] userspace proxy: processing 0 service events
I0218 04:52:21.997258    3509 proxier.go:350] userspace syncProxyRules took 87.303349ms

because those come from upstream.)
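(The eventual fix, as comment 8 below explains, demotes these messages from V(2) to V(4) rather than removing them outright. A minimal sketch of the shape of that change, assuming klog as the logger; the actual call sites in proxy.go may differ:

// Before: emitted on every sync loop at the default node verbosity (--v=2).
klog.V(2).Infof("hybrid proxy: syncProxyRules start")

// After: only emitted when the node runs with --v=4 or higher, so an
// ordinary cluster no longer sees these lines on every sync.
klog.V(4).Infof("hybrid proxy: syncProxyRules start")
)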

Comment 3 Anurag saxena 2020-02-18 18:33:59 UTC
Thanks Dan, but what would happen to tests like https://bugzilla.redhat.com/show_bug.cgi?id=1792060#c0? For features like the enableUnidling CNO flag, whether the hybrid proxy logs appear is the only way to check the feature's functionality, unless another bug should be filed to add explicit logging for enableUnidling, say.

Comment 4 Dan Winship 2020-02-18 19:01:29 UTC
The "userspace" logs will also go away when the hybrid proxier is disabled

Comment 5 Anurag saxena 2020-02-19 20:43:11 UTC
@Dan Winship, I still see hybrid proxier logs and other logs on the Feb 19 build, 4.4.0-0.nightly-2020-02-19-080151, which is the next build after your merged PR. Is there anything I am missing?

$ oc logs sdn-6nbkf | grep -i "hybrid proxy"
I0219 20:24:01.496566    2440 proxy.go:334] hybrid proxy: syncProxyRules start
I0219 20:24:01.634710    2440 proxy.go:337] hybrid proxy: mainProxy.syncProxyRules complete
I0219 20:24:01.693145    2440 proxy.go:340] hybrid proxy: unidlingProxy.syncProxyRules complete
I0219 20:24:06.077511    2440 proxy.go:334] hybrid proxy: syncProxyRules start
I0219 20:24:06.234285    2440 proxy.go:337] hybrid proxy: mainProxy.syncProxyRules complete
I0219 20:24:06.332004    2440 proxy.go:340] hybrid proxy: unidlingProxy.syncProxyRules complete
I0219 20:24:19.252986    2440 proxy.go:334] hybrid proxy: syncProxyRules start
I0219 20:24:19.381674    2440 proxy.go:337] hybrid proxy: mainProxy.syncProxyRules complete
I0219 20:24:19.437710    2440 proxy.go:340] hybrid proxy: unidlingProxy.syncProxyRules complete
I0219 20:24:19.437724    2440 proxy.go:334] hybrid proxy: syncProxyRules start
I0219 20:24:19.586667    2440 proxy.go:337] hybrid proxy: mainProxy.syncProxyRules complete
I0219 20:24:19.642343    2440 proxy.go:340] hybrid proxy: unidlingProxy.syncProxyRules complete

...

$ oc logs sdn-6nbkf | grep -i "add svc openshift-controller-manager"
I0219 20:23:57.078334    2440 proxy.go:368] sdn proxy: add svc openshift-controller-manager-operator/metrics: &Service{ObjectMeta:{metrics  openshift-controller-manager-operator /api/v1/namespaces/openshift-controller-manager-operator/services/metrics 57e3516c-6742-41df-b8cb-a56fcf0ff741 4199 0 2020-02-19 20:11:20 +0000 UTC <nil> <nil> map[app:openshift-controller-manager-operator] map[exclude.release.openshift.io/internal-openshift-hosted:true service.alpha.openshift.io/serving-cert-secret-name:openshift-controller-manager-operator-serving-cert service.alpha.openshift.io/serving-cert-signed-by:openshift-service-serving-signer@1582143333 service.beta.openshift.io/serving-cert-signed-by:openshift-service-serving-signer@1582143333] [] []  []},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:https,Protocol:TCP,Port:443,TargetPort:{0 8443 },NodePort:0,},},Selector:map[string]string{app: openshift-controller-manager-operator,},ClusterIP:172.30.237.223,Type:ClusterIP,ExternalIPs:[],SessionAffinity:None,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:,HealthCheckNodePort:0,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamily:nil,},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{},},},}

Comment 6 Dan Winship 2020-02-19 23:37:21 UTC
oh, they're still logged at V(4). Maybe that's what we use by default in CI...

Comment 7 Anurag saxena 2020-02-20 00:20:25 UTC
(In reply to Dan Winship from comment #6)
> oh, they're still logged at V(4). Maybe that's what we use by default in
> CI...

Dan, do you mean this bug targets V(6), not V(4)?

Comment 8 Dan Winship 2020-02-20 14:42:09 UTC
Previously, the messages were logged at V(2); now they are logged at V(4). This means that in an ordinary cluster they will no longer be displayed. If CI is running with --loglevel 4, they would still be displayed there.
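(For reference, a self-contained sketch of how klog's verbosity gate behaves when a node runs with --v=2, the default; this assumes klog v1, as openshift-sdn used at the time, and the flag wiring here is illustrative:

package main

import (
	"flag"

	"k8s.io/klog"
)

func main() {
	// klog registers its flags, including -v, on the default flag set.
	klog.InitFlags(nil)
	// Log to stderr and simulate an ordinary cluster node running at --v=2.
	_ = flag.Set("logtostderr", "true")
	_ = flag.Set("v", "2")
	flag.Parse()

	// Printed: 2 <= configured verbosity 2.
	klog.V(2).Infof("sdn proxy: this still appears at the default level")
	// Suppressed: 4 > configured verbosity 2. Only visible in a cluster
	// (or CI job) running with --loglevel 4 or higher.
	klog.V(4).Infof("hybrid proxy: syncProxyRules start")

	klog.Flush()
}
)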

Comment 9 Anurag saxena 2020-02-20 18:38:00 UTC
Thanks Dan for clarifying the testing steps offline. I still see the logs mentioned in comment 5 on the very latest nightly, 4.4.0-0.nightly-2020-02-20-122510, which is 4 builds after your merged PR build.

Steps taken

1) Found it is running at V(2):

$ oc get pod sdn-r4g7z -n openshift-sdn -o yaml | grep -i "\-v="
      exec /usr/bin/openshift-sdn-node --proxy-config=/config/kube-proxy-config.yaml --v=${DEBUG_LOGLEVEL:-2}   <<<<<<

2) SDN images (from oc get pod sdn-r4g7z -n openshift-sdn -o yaml):

Containers:
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c08a091d1f41bdce7a307a80941f9031819af60cdfd767ba36e216bbaadcb675
initContainers:
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6eb2b120cd259221b4930f648fb711388704384825ee12e41e97186d14e65106

Comment 10 Dan Winship 2020-02-24 13:03:00 UTC
Apparently new SDN images have not been built since https://github.com/openshift/sdn/pull/108 merged. Presumably Casey is getting automated mail about this or something...

Comment 11 zhaozhanqi 2020-02-25 10:16:01 UTC
I can still reproduce this issue on 4.4.0-0.nightly-2020-02-24-020331:

oc logs sdn-4qzsz -n openshift-sdn | grep proxy | more
I0224 11:10:03.983513    5321 cmd.go:123] Reading proxy configuration from /config/kube-proxy-config.yaml
I0224 11:10:03.988934    5321 cmd.go:227] Watching config file /config/kube-proxy-config.yaml for changes
I0224 11:10:03.988974    5321 cmd.go:227] Watching config file /config/..2020_02_24_07_20_20.891190060/kube-proxy-config.yaml for changes
I0224 11:10:05.573070    5321 proxy.go:103] Using unidling+iptables Proxier.
I0224 11:10:05.900544    5321 proxy.go:136] Tearing down userspace rules.
I0224 11:10:06.013176    5321 proxier.go:218] Setting proxy IP to 10.0.98.194 and initializing iptables
I0224 11:10:06.093176    5321 proxy.go:95] Starting multitenant SDN proxy endpoint filter
I0224 11:10:06.199208    5321 proxy.go:234] Started Kubernetes Proxy on 0.0.0.0
I0224 11:10:06.199240    5321 cmd.go:171] openshift-sdn network plugin waiting for proxy startup to complete
I0224 11:10:06.219642    5321 proxy.go:368] sdn proxy: add svc openshift-cluster-version/cluster-version-operator: &Service{ObjectMeta:{cluster-version-operator  openshift-cluster-version /api/v1/namespaces/openshift-cluster-version/services/cluster-version-operator 3aeb3451-8f37-484b-9537-42145c95fde0 7 0 2020-02-24 07:18:44 +0000 UTC <nil> <nil> map[k8s-app:cluster-version-operator] map[exclude.release.openshift.io/internal-openshift-hosted:true] [] []  []},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:metrics,Protocol:TCP,Port:9099,TargetPort:{0 9099 },NodePort:0,},},Selector:map[string]string{k8s-app: cluster-version-operator,},ClusterIP:172.30.122.58,Type:ClusterIP,ExternalIPs:[],SessionAffinity:None,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:,HealthCheckNodePort:0,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamily:nil,},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{},},},}
I0224 11:10:06.227028    5321 proxy.go:368] sdn proxy: add svc openshift-kube-controller-manager/kube-controller-manager: &Service{ObjectMeta:{kube-controller-manager  openshift-kube-controller-manager /api/v1/namespaces/openshift-kube-controller-manager/services/kube-controller-manager 3bc77dd3-cc6e-4964-b915-e359c3beeac5 4106 0 2020-02-24 07:23:01 +0000 UTC <nil> <nil> map[] map[prometheus.io/scheme:https prometheus.io/scrape:true service.alpha.openshift.io/serving-cert-secret-name:serving-cert service.alpha.openshift.io/serving-cert-signed-by:openshift-service-serving-signer@1582528976 service.beta.openshift.io/serving-cert-signed-by:openshift-service-serving-signer@1582528976] [] []  []},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:https,Protocol:TCP,Port:443,TargetPort:{0 10257 },NodePort:0,},},Selector:map[string]string{kube-controller-manager: true,},ClusterIP:172.30.47.201,Type:ClusterIP,ExternalIPs:[],SessionAffinity:None,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:,HealthCheckNodePort:0,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamily:nil,},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{},},},}
I0224 11:10:06.227126    5321 proxy.go:368] sdn proxy: add svc openshift-marketplace/certified-operators: &Service{ObjectMeta:{certified-operators  openshift-marketplace /api/v1/namespaces/openshift-marketplace/services/certified-operators a6a934ca-996d-49fc-babc-c29c72e75aac 13064 0 2020-02-24 07:33:32 +0000 UTC <nil> <nil> map[opsrc-owner-name:certified-operators opsrc-owner-namespace:openshift-marketplace] map[] [] []  []},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:grpc,Protocol:TCP,Port:50051,TargetPort:{0 50051 },NodePort:0,},},Selector:map[string]string{marketplace.operatorSource: certified-operators,},ClusterIP:172.30.113.6,Type:ClusterIP,ExternalIPs:[],SessionAffinity:None,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:,HealthCheckNodePort:0,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamily:nil,},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{},},},}
I0224 11:10:06.227775    5321 proxy.go:368] sdn proxy: add svc openshift-multus/multus-admission-controller-monitor-service: &Service{ObjectMeta:{multus-admission-controller-monitor-service  openshift-multus /api/v1/namespaces/openshift-multus/services/multus-admission-controller-monitor-service 808dc753-fafb-4724-89e4-962a2634bcb2 1832 0 2020-02-24 07:20:09 +0000 UTC <nil> <nil> map[name:multus-admission-controller-monitor-service] map[] [{operator.openshift.io/v1 Network cluster 6859eeed-af

Comment 12 Dan Winship 2020-03-02 13:06:04 UTC
ART problems are fixed and there should be new SDN builds with the reduced logging now.

Comment 13 Anurag saxena 2020-03-02 19:53:09 UTC
Thanks Dan. This looks good on the latest 4.4 nightly:

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-03-02-173524   True        False         7m30s   Cluster version is 4.4.0-0.nightly-2020-03-02-173524

# oc get pod sdn-cq4pc -n openshift-sdn -o yaml | grep -i "\-v="
      exec /usr/bin/openshift-sdn-node --proxy-config=/config/kube-proxy-config.yaml --v=${DEBUG_LOGLEVEL:-2}

# oc logs sdn-cq4pc | grep -iE "sdn proxy:|hybrid"
#

Comment 15 errata-xmlrpc 2020-05-13 21:59:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581