Bug 1883502

Summary: Logging is broken due to mix of k8s.io/klog v1 and v2
Product: OpenShift Container Platform
Component: Image Registry
Reporter: Ricardo Maraschini <rmarasch>
Assignee: Ricardo Maraschini <rmarasch>
QA Contact: Wenjing Zheng <wzheng>
Status: CLOSED ERRATA
Severity: high
Priority: high
Version: 4.6
Target Release: 4.7.0
CC: aos-bugs, jhadvig, jokerman, obulatov, sttts, yapei
Hardware: Unspecified
OS: Unspecified
Clone Of: 1883461
Last Closed: 2021-02-24 15:21:16 UTC
Doc Type: Bug Fix
Doc Text:
Cause: The operator used two distinct versions of the same logging package.
Consequence: Operator logs were partially lost.
Fix: The logging package versions were made equal: the logging package used by the operator was upgraded to match the one used by client-go.
Result: Logs are no longer lost.

Description Ricardo Maraschini 2020-09-29 12:18:18 UTC
+++ This bug was initially created as a clone of Bug #1883461 +++

k8s.io/klog moved to v2 in client-go and friends. The operator does not use v2 yet, and hence we lose one half of the logging output (v2).
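For reference, the fix described in the Doc Text above is essentially an import-path change in the operator. A sketch only; the actual PR may also touch call sites:

```go
// Before: the operator logged through klog v1.
import "k8s.io/klog"

// After: align with client-go by moving to the v2 module path.
import klog "k8s.io/klog/v2"
```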

--- Additional comment from Stefan Schimanski on 2020-09-29 11:09:06 UTC ---

I did some tests using openshift-apiserver that uses kube component-base to add klog command line flags:

$ ./openshift-apiserver start --config asfd --v=4
I0929 13:04:18.354425   53628 cmd.go:57] v1 klog
I0929 13:04:18.354506   53628 cmd.go:58] v2 klog
I0929 13:04:18.354512   53628 cmd.go:61] v2 V(2) klog
I0929 13:04:18.354515   53628 cmd.go:64] v2 V(4) klog

So this means that we DO NOT set verbosity for v1 klog, and hence lose all the klogv1.V(2).Infof output.

Comment 1 Oleg Bulatov 2020-09-29 19:09:50 UTC
We didn't have the ability to change log levels before, so it's not a blocker for 4.6.

Comment 2 Ricardo Maraschini 2020-10-02 07:39:31 UTC
A PR already exists for this one, awaiting review. Other than that, we still need to wait for the 4.7 cycle to start.

Comment 7 Wenjing Zheng 2020-10-26 07:40:48 UTC
Thanks Ricardo and Oleg!
Verified on 4.7.0-0.nightly-2020-10-24-155529:
1. With a GCP cluster, set spec.operatorLogLevel=TraceAll.
2. Set the image registry to Removed.
3. After a few seconds, a log line like the one below appears:
$ oc logs pods/cluster-image-registry-operator-5b6f5ff8b-rwwgp | grep deleting
I1026 07:32:48.770962       1 gcs.go:327] deleting all objects in bucket wzheng-47-2tfrc-image-registry-us-central1-fcuhhqklefmcoryijnt
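Step 1 above can be done with a patch against the image registry operator config; a sketch, assuming the standard OpenShift operator CR field spec.operatorLogLevel (the exact resource name should be checked against the cluster):

```shell
oc patch configs.imageregistry.operator.openshift.io/cluster \
  --type merge -p '{"spec":{"operatorLogLevel":"TraceAll"}}'
```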

Comment 10 errata-xmlrpc 2021-02-24 15:21:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633