Bug 1567658
| Summary: | The default image registry log level should not be :debug | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Clayton Coleman <ccoleman> |
| Component: | Image Registry | Assignee: | Alexey Gladkov <agladkov> |
| Status: | CLOSED ERRATA | QA Contact: | Wenjing Zheng <wzheng> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.9.0 | CC: | agladkov, aos-bugs, bparees, ccoleman, lmeyer, obulatov, rcook, wzheng |
| Target Milestone: | --- | | |
| Target Release: | 3.11.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Enhancement |
| Doc Text: | Feature: Reduce the number of logs. Reason: The registry generates 2 GB of logs per day. Result: The log level was changed to "info". | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-10-11 07:19:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description (Clayton Coleman, 2018-04-15 21:14:45 UTC)
Submitted pull request https://github.com/openshift/image-registry/pull/75. Copying the PR discussion to this bug:

bparees: Before we do this, I'd love to know whether there's a middle-ground alternative, because we have to debug registry push failures a lot (often not because of an actual issue in the registry that we could fix, but rather permission issues), and the debug info is super helpful when that happens. Forcing people to redeploy their registry and recreate the failure so we can get logs is going to be painful.

ccoleman: Can we bump permission issues up a level so they show by default? We had 21 GB of logs from 10 days of runs. That's high enough that I'd expect us to cause disk pressure in a lot of environments, which raises the possibility of other failures.

obulatov: I like the idea, but first we need to fix how we log our errors. Unfortunately, the upstream code needs to be fixed too; otherwise we will not be able to explain the next "blob unknown to registry" error (did we try to pull the blob from another registry? which one? what about the local storage?). The worst part: usually it's not a registry problem, but we still need to explain it.

agladkov: Everything that is currently in the log is necessary for investigating various kinds of failures. Problems with pullthrough, the storage backend, and authentication each require logging different entities. In the current situation all debug messages would have to become info, but all of them should remain in the log. If there are problems with disk space or expiration, then we should think about using a log server (syslog).
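For context, in a docker/distribution-style registry configuration the default under discussion lives in the `log` stanza; a minimal illustrative fragment with the proposed default (not the exact shipped OpenShift config) might look like:

```yaml
# Illustrative docker/distribution-style configuration fragment.
# The bug asks for "info" to be the shipped default instead of "debug".
version: 0.1
log:
  level: info
  formatter: text
```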
NAK

bparees: I'm guessing we don't have this level of control because of the docker/distribution code, but it seems like a better solution would be to buffer the log messages and only emit the output if the overall request failed. Is such a thing possible for us here?

I would follow up on Alexey's suggestion... can we not get log rolling/expiration from our underlying logging infrastructure?

*** Bug 1383816 has been marked as a duplicate of this bug. ***

The current plan is to implement specific error logging in our registry code that rolls up all the failure information and context needed. Being tracked in Trello: https://trello.com/c/JmvVZj5L

Handing off to Alexey since he's working the Trello card.

I can still see some debug info for a successful push and a failed push, as below, with v3.11.0-0.24.0:
```
time="2018-08-29T03:05:56.57009132Z" level=debug msg="authorizing request" go.version=go1.9.4 http.request.host="docker-registry.default.svc:5000" http.request.id=048e45db-8725-4e96-916c-77bd5cbaab80 http.request.method=HEAD http.request.remoteaddr="10.128.0.1:42438" http.request.uri="/v2/wzheng3/myimage/blobs/sha256:8c5a7da1afbc602695fcb2cd6445743cec5ff32053ea589ea9bd8773b7068185" http.request.useragent="docker/1.13.1 go/go1.9.2 kernel/3.10.0-862.11.6.el7.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 \(linux\))" instance.id=6042c925-1c7e-470d-9a17-e4300616d3a8 vars.digest="sha256:8c5a7da1afbc602695fcb2cd6445743cec5ff32053ea589ea9bd8773b7068185" vars.name=wzheng3/myimage
>>>>>>>>>>>>>>>
time="2018-08-29T03:08:10.815634543Z" level=debug msg="authorizing request" go.version=go1.9.4 http.request.host="docker-registry.default.svc:5000" http.request.id=94300499-4323-4c6f-89cd-12b654c14bd0 http.request.method=GET http.request.remoteaddr="10.128.0.1:42828" http.request.uri=/v2/wzheng3/wonderful/manifests/latest http.request.useragent="docker/1.13.1 go/go1.9.2 kernel/3.10.0-862.11.6.el7.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 \(linux\))" instance.id=6042c925-1c7e-470d-9a17-e4300616d3a8 vars.name=wzheng3/wonderful vars.reference=latest
time="2018-08-29T03:08:10.823906181Z" level=debug msg="Origin auth: checking for access to repository:wzheng3/wonderful:pull" go.version=go1.9.4 http.request.host="docker-registry.default.svc:5000" http.request.id=94300499-4323-4c6f-89cd-12b654c14bd0 http.request.method=GET http.request.remoteaddr="10.128.0.1:42828" http.request.uri=/v2/wzheng3/wonderful/manifests/latest http.request.useragent="docker/1.13.1 go/go1.9.2 kernel/3.10.0-862.11.6.el7.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 \(linux\))" instance.id=6042c925-1c7e-470d-9a17-e4300616d3a8 openshift.auth.user=wzheng openshift.auth.userid=39597686-ab37-11e8-a078-fa163e27bb7b vars.name=wzheng3/wonderful vars.reference=latest
```
agladkov: How did you deploy the image-registry? Using ansible, oc cluster up, or just upgrading an existing installation?

*** Bug 1623577 has been marked as a duplicate of this bug. ***

bparees: According to this issue, ansible is still setting it to debug: https://github.com/openshift/openshift-ansible/issues/9820

Fix to set the log level: https://github.com/openshift/openshift-ansible/pull/9821

(In reply to Alexey Gladkov from comment #11)
> How did you deploy the image-registry? Using ansible, oc cluster up, or just upgrading an existing installation?

Like Ben's comment 13, I am using ansible.

Ryan's PR merged. Thanks, Ryan!

Verified with the version below:
openshift-ansible-3.11.0-0.28.0.git.0.730d4be.el7.noarch.rpm

The log level is no longer debug by default:

```
time="2018-09-06T06:44:51.405801723Z" level=error msg="response completed with error" err.code="name unknown" err.message="repository name not known to registry" go.version=go1.9.4 http.request.host="docker-registry.default.svc:5000" http.request.id=543af9e7-1890-4468-a9ab-7c1f5e71d80b http.request.method=GET http.request.remoteaddr="10.2.4.1:38032" http.request.uri=/v2/wzheng1/123/manifests/latest http.request.useragent="docker/1.13.1 go/go1.9.2 kernel/3.10.0-862.11.6.el7.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 \(linux\))" http.response.contenttype="application/json; charset=utf-8" http.response.duration=23.75832ms http.response.status=404 http.response.written=87 instance.id=62a18abb-5999-4ddf-a1b6-db20c1b30b17 openshift.auth.user=wzheng openshift.auth.userid=0a38c0d4-b19d-11e8-a3b1-0e9bc0025290 vars.name=wzheng1/123 vars.reference=latest
time="2018-09-06T06:44:51.405898872Z" level=info msg=response go.version=go1.9.4 http.request.host="docker-registry.default.svc:5000" http.request.id=33ecc63c-80d7-4d4f-a250-49e7d1be4af6 http.request.method=GET http.request.remoteaddr="10.2.4.1:38032" http.request.uri=/v2/wzheng1/123/manifests/latest http.request.useragent="docker/1.13.1 go/go1.9.2 kernel/3.10.0-862.11.6.el7.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 \(linux\))" http.response.contenttype="application/json; charset=utf-8" http.response.duration=23.896237ms http.response.status=404 http.response.written=87 instance.id=62a18abb-5999-4ddf-a1b6-db20c1b30b17
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
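The verification step above, checking that no `level=debug` lines remain, can be approximated offline. A minimal sketch, with two sample lines from this report standing in for real registry output (against a live cluster one would pipe the registry pod's logs instead):

```shell
# Count debug-level lines in captured registry log output.
# The printf lines are sample data; substitute real log output in practice.
printf '%s\n' \
  'time="2018-09-06T06:44:51Z" level=error msg="response completed with error"' \
  'time="2018-09-06T06:44:51Z" level=info msg=response' \
  | grep -c 'level=debug' || echo "no debug-level lines"
```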