Bug 1885996

Summary: [Cloud Team - Cluster API Provider Azure] Logging is broken due to mix of k8s.io/klog v1 and v2
Product: OpenShift Container Platform Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Cloud ComputeAssignee: Joel Speed <jspeed>
Cloud Compute sub component: Other Providers QA Contact: sunzhaohua <zhsun>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: aos-bugs, jhadvig, jokerman, sttts, yapei, zhsun
Version: 4.6   
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-11-16 14:37:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1885151    
Bug Blocks:    

Description OpenShift BugZilla Robot 2020-10-07 12:24:33 UTC
+++ This bug was initially created as a clone of Bug #1885151 +++

+++ This bug was initially created as a clone of Bug #1883497 +++

We have klog throughout our controllers and because MAO depends on both klog v1 and v2, this means all of our components will need an update

+++ This bug was initially created as a clone of Bug #1883461 +++

k8s.io/klog moved to v2 in client-go and frieds. The operator does not use v2 yet and hence, we lose one half of the logging output (v2).

--- Additional comment from Stefan Schimanski on 2020-09-29 11:09:06 UTC ---

I did some tests using openshift-apiserver that uses kube component-base to add klog command line flags:

$ ./openshift-apiserver start --config asfd --v=4
I0929 13:04:18.354425   53628 cmd.go:57] v1 klog
I0929 13:04:18.354506   53628 cmd.go:58] v2 klog
I0929 13:04:18.354512   53628 cmd.go:61] v2 V(2) klog
I0929 13:04:18.354515   53628 cmd.go:64] v2 V(4) klog

So this means that we DO NOT set verbosity for v1 klog, and hence lose all the klogv1.V(2).Infof output.

--- Additional comment from Joel Speed on 2020-10-02 10:50:21 UTC ---

Need more PRs to revendor MAO now it's merged, back to Assigned

--- Additional comment from Joel Speed on 2020-10-05 09:06:53 UTC ---

One PR didn't make it into the 4.6 branch before the cutover, going to remove that PR from this Bug and create a clone for that particular PR

Comment 1 Joel Speed 2020-10-07 12:29:38 UTC
We can wait for 4.6.z for this

Comment 2 sunzhaohua 2020-11-04 03:03:59 UTC
This bug's PR is dev-approved and not yet merged, so I'm following DPTP-660 to do pre-merge verification by using cluster-bot to launch a cluster with the open PR.

clusterversion: 4.6.0-0.ci.test-2020-11-04-015242-ci-ln-z27ngxk
$ oc rsh -c machine-controller machine-api-controllers-745b86d794-5l858
sh-4.4$ ls
bin   etc   lib64			media  proc  sbin  termination-handler	var
boot  home  lost+found			mnt    root  srv   tmp
dev   lib   machine-controller-manager	opt    run   sys   usr
sh-4.4$ ./machine-controller-manager -v=4
I1104 02:40:31.022291      28 request.go:581] Throttling request took 2.339248044s, request: GET:https://172.30.0.1:443/apis/operators.coreos.com/v1alpha1?timeout=32s
F1104 02:40:31.027691      28 main.go:111] Failed to set up overall controller manager: error listening on :8081: listen tcp :8081: bind: address already in use
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0xc00012c001, 0xc00000a600, 0xa1, 0xf1)
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/klog/v2/klog.go:996 +0xb9
k8s.io/klog/v2.(*loggingT).output(0x27361c0, 0xc000000003, 0x0, 0x0, 0xc000166070, 0x2683f03, 0x7, 0x6f, 0x0)
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/klog/v2/klog.go:945 +0x191
k8s.io/klog/v2.(*loggingT).printf(0x27361c0, 0xc000000003, 0x0, 0x0, 0x1a54e93, 0x2f, 0xc000887da0, 0x1, 0x1)
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/klog/v2/klog.go:733 +0x17a
k8s.io/klog/v2.Fatalf(...)
	/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor/k8s.io/klog/v2/klog.go:1463
...

So this bug is pre-merge-verified. After the PR gets merged, the bug will be moved to VERIFIED by the bot automatically, if not working, I will move to VERIFIED manually.

Comment 6 errata-xmlrpc 2020-11-16 14:37:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.4 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4987