Bug 1902456 - Improvement of log error messages in builds
Summary: Improvement of log error messages in builds
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Image Registry
Version: 4.8
Hardware: All
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
: ---
Assignee: Oleg Bulatov
QA Contact: wewang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-29 09:27 UTC by David Hernández Fernández
Modified: 2023-03-14 10:16 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The error handling reported an "access denied" condition as "authentication required". Consequence: The build log showed a misleading error, which could cause confusion. Fix: The Docker Distribution error handling now returns "access denied" instead of "authentication required". Result: The error log reports the more precise "access denied" error.
Clone Of:
Environment:
Last Closed: 2023-03-09 01:00:18 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift image-registry pull 291 0 None None None 2021-08-26 10:08:48 UTC

Description David Hernández Fernández 2020-11-29 09:27:52 UTC
Description of problem:

Example 1: S2I build in project A from image in project B

This requires a RoleBinding in project B that grants system:image-puller to the Group system:serviceaccounts:A. If this is missing or incorrect, you get the following error when building:

Cloning "ssh://git/example.git" ...
	Commit:	a42782afc295e8ab019728ccc998bf0c3f4a2e74 (Test)
	Author:	Dave <dave>
	Date:	Fri Oct 16 10:58:31 2020 +0100
Caching blobs under "/var/cache/blobs".
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
error: build error: After retrying 2 times, Pull image still failed due to error: unauthorized: authentication required

This issue has nothing to do with authentication: the serviceaccount builder in project A has successfully authenticated to the registry, but it is not authorised to pull the S2I builder image. I'd suggest a much more useful error would be "system:serviceaccounts:A:builder does not have permission to pull image-registry.openshift-image-registry.svc:5000/B/s2i-builder-xyz"
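
As a rough illustration of the message suggested above, here is a minimal Python sketch that assembles such an error from a pull spec. The function name `describe_pull_denial` and its parameters are hypothetical and are not part of buildah, the build controller, or the image registry. It assumes a pull spec of the form `<registry>/<namespace>/<name>` and uses the Kubernetes user name form `system:serviceaccount:<namespace>:<name>`.

```python
def describe_pull_denial(sa_namespace: str, sa_name: str, pull_spec: str) -> str:
    """Build a hypothetical, more specific message for a denied image pull."""
    # Split "<registry>/<namespace>/<name>[:tag]" into registry and repository.
    _registry, _, repository = pull_spec.partition("/")
    # The first path component of the repository is the project (namespace).
    image_namespace, _, image_name = repository.partition("/")
    if not image_namespace or not image_name:
        raise ValueError(f"expected <registry>/<namespace>/<name>, got {pull_spec!r}")
    subject = f"system:serviceaccount:{sa_namespace}:{sa_name}"
    return (
        f"{subject} does not have permission to pull {pull_spec}; "
        f"check for a RoleBinding to system:image-puller in project {image_namespace}"
    )


if __name__ == "__main__":
    print(describe_pull_denial(
        "A", "builder",
        "image-registry.openshift-image-registry.svc:5000/B/s2i-builder-xyz",
    ))
```

The point of the sketch is only that the subject and the target project are both known at pull time, so a message naming them is possible in principle.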


For reproducer:

apiVersion: v1
kind: Template
metadata:
  name: test-a
parameters:
objects:
  - apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: "system:image-pullers"
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: "system:image-puller"
    subjects:
      - apiGroup: rbac.authorization.k8s.io
        kind: Group
        name: "system:serviceaccounts:test-b"

  - apiVersion: image.openshift.io/v1
    kind: ImageStream
    metadata:
      name: python
  - apiVersion: build.openshift.io/v1
    kind: BuildConfig
    metadata:
      name: python-36
    spec:
      failedBuildsHistoryLimit: 1
      successfulBuildsHistoryLimit: 1
      output:
        to:
          kind: ImageStreamTag
          name: python:3.6
      runPolicy: Serial
      source:
        type: Dockerfile
        dockerfile: |
          FROM ignored
          RUN echo hello > /tmp/test
      strategy:
        dockerStrategy:
          from:
            kind: ImageStreamTag
            namespace: openshift
            name: python:3.6


Example 2: DeploymentConfig in project A references image in project B

e.g. spec.template.spec.containers[].image: image-registry.openshift-image-registry.svc:5000/B/xyz:latest

Pod status ends up in ImagePullBackOff / ErrImagePull with an error in the Events tab:

Failed to pull image "image-registry.openshift-image-registry.svc:5000/B/xyz:latest": rpc error: code = Unknown desc = Error reading manifest latest in image-registry.openshift-image-registry.svc:5000/B/xyz: unauthorized: authentication required

For reproducer:

apiVersion: v1
kind: Template
metadata:
  name: test-b
parameters:
objects:
  # ----------- IMAGE STREAM --------------------
  - apiVersion: image.openshift.io/v1
    kind: ImageStream
    metadata:
      name: django-ex

  # ----------- BUILD --------------------
  - apiVersion: build.openshift.io/v1
    kind: BuildConfig
    metadata:
      name: django-ex
    spec:
      failedBuildsHistoryLimit: 1
      successfulBuildsHistoryLimit: 1
      output:
        to:
          kind: ImageStreamTag
          name: "django-ex:latest"
      resources:
        requests:
          cpu: 100m
          memory: 100Mi
        limits:
          cpu: 2
          memory: 2Gi
      source:
        git:
          uri: "https://github.com/sclorg/django-ex.git"
        type: Git
      strategy:
        sourceStrategy:
          from:
            kind: ImageStreamTag
            name: "python:3.6"
            namespace: test-a

  - apiVersion: v1
    kind: DeploymentConfig
    metadata:
      name: python-example
    spec:
      replicas: 1
      selector:
        app: python-example
        deploymentconfig: python-example
      template:
        metadata:
          labels:
            app: python-example
            deploymentconfig: python-example
        spec:
          containers:
            - image: image-registry.openshift-image-registry.svc:5000/test-a/python:3.6
              imagePullPolicy: Always
              name: python-example
              command: [ "/bin/bash", "-c", "sleep infinity" ]
              resources:
                requests:
                  cpu: 100m
                  memory: 100Mi
                limits:


Attached are two templates, one for a project called test-a and another for a project called test-b. If you apply them, you'll be able to run both builds and roll out the deploymentconfig. However, if you then remove the system:image-puller rolebinding in test-a that allows test-b, and then try rebuilding the build in test-b and rolling out the deploymentconfig in test-b, you'll get the error messages discussed above.

Actual results: The error messages in the logs are incorrect, or not accurate enough for troubleshooting.

Expected results: Error messages that distinguish a missing pull permission from a failed authentication.

Comment 1 Adam Kaplan 2020-12-01 17:00:51 UTC
This can be a bit challenging to improve for builds. The error message reported here comes from buildah, which has no special knowledge about the OpenShift image registry. For the internal registry it uses the pull secret from the build controller, and pulls from the internal registry just like from any other OCI/Docker image registry.

CC-ing Oleg - is "unauthorized: authentication required" coming from the image registry HTTP response? Is this something we can improve?

Comment 3 David Hernández Fernández 2020-12-14 16:52:00 UTC
I was just checking some options, hence the lowest priority is set. If it can be improved, it would help the user experience and make it easier to check how builds are set up, but if it relies on such external factors and is too complicated or time-consuming, just let us know.

Comment 9 Oleg Bulatov 2021-03-19 11:46:11 UTC
That's how Docker Distribution works: first the client gets a token with a desired scope, then it uses it. If your credentials are not enough to get such a token, then you get Unauthorized [1].

The Docker client has additional logic for handling these errors [2]; we need to investigate further whether we can use it.

[1]: https://github.com/distribution/distribution/blob/a01c71e2477eea211bbb83166061e103e0b2ec95/registry/handlers/app.go#L844-L851
[2]: https://github.com/distribution/distribution/blob/a01c71e2477eea211bbb83166061e103e0b2ec95/registry/client/errors.go#L110-L113
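
To make the distinction concrete: in the Docker Registry HTTP API V2, an error response carries a JSON body with an `errors` list, and each entry has a `code` such as `UNAUTHORIZED` or `DENIED`. The following Python sketch classifies such a response. It is only an illustration under that assumption; the function name `classify_registry_error` is hypothetical, and this is not the logic used by the Docker client referenced in [2].

```python
import json


def classify_registry_error(status: int, body: str) -> str:
    """Return 'authorization', 'authentication', or 'other' for a registry error."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    # Tolerate bodies that are not the expected {"errors": [...]} object.
    errors = payload.get("errors") if isinstance(payload, dict) else None
    if not isinstance(errors, list):
        errors = []
    codes = {str(e.get("code", "")).upper() for e in errors if isinstance(e, dict)}
    # Prefer the error code over the HTTP status: a 401 can still carry DENIED.
    if "DENIED" in codes or status == 403:
        return "authorization"
    if "UNAUTHORIZED" in codes or status == 401:
        return "authentication"
    return "other"


if __name__ == "__main__":
    body = '{"errors":[{"code":"UNAUTHORIZED","message":"authentication required"}]}'
    print(classify_registry_error(401, body))
```

A client can only tell the two cases apart if the registry sends a different code, which is why the message has to change on the registry side.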

Comment 14 wewang 2021-09-22 02:31:18 UTC
It can be reproduced:
[wewang@localhost ~]$ oc logs -f build/django-ex-2
Cloning "https://github.com/sclorg/django-ex.git" ...
	Commit:	7cbc59619cb3ad23d32a06a398592da3eb34388c (Merge pull request #181 from sclorg/dependabot/pip/django-debug-toolbar-1.11.1)
	Author:	Lumír 'Frenzy' Balhar <lbalhar>
	Date:	Mon Apr 19 08:01:14 2021 +0200
time="2021-09-22T02:17:08Z" level=info msg="Not using native diff for overlay, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled"
I0922 02:17:08.905212       1 defaults.go:102] Defaulting to storage driver "overlay" with options [mountopt=metacopy=on].
Caching blobs under "/var/cache/blobs".
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
error: build error: After retrying 2 times, Pull image still failed due to error: unauthorized: authentication required

Comment 15 wewang 2021-09-22 08:08:17 UTC
The PR is in WIP status.

Comment 16 wewang 2021-09-28 02:25:45 UTC
Verified in version:
4.9.0-0.ci.test-2021-09-28-013422-ci-ln-n8rjjs2-latest
[wewang@localhost ~]$ oc logs -f build/django-ex-1
Cloning "https://github.com/sclorg/django-ex.git" ...
	Commit:	7cbc59619cb3ad23d32a06a398592da3eb34388c (Merge pull request #181 from sclorg/dependabot/pip/django-debug-toolbar-1.11.1)
	Author:	Lumír 'Frenzy' Balhar <lbalhar>
	Date:	Mon Apr 19 08:01:14 2021 +0200
time="2021-09-28T02:21:17Z" level=info msg="Not using native diff for overlay, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled"
I0928 02:21:17.046017       1 defaults.go:102] Defaulting to storage driver "overlay" with options [mountopt=metacopy=on].
Caching blobs under "/var/cache/blobs".
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
error: build error: After retrying 2 times, Pull image still failed due to error: errors:
denied: requested access to the resource is denied
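
For triage, the two error strings quoted in this bug can be told apart mechanically. The sketch below is a minimal Python example that scans build log text for them; the function name `triage_build_log` and the hint wording are hypothetical, and the patterns are only the two strings that appear in the logs above.

```python
import re

# The two registry error strings quoted in this bug's build logs.
DENIED = re.compile(r"denied: requested access to the resource is denied")
UNAUTHORIZED = re.compile(r"unauthorized: authentication required")


def triage_build_log(log_text: str) -> str:
    """Map a build log to a short troubleshooting hint."""
    if DENIED.search(log_text):
        return "permission problem: check system:image-puller RoleBindings in the image's project"
    if UNAUTHORIZED.search(log_text):
        # Without the fix, this string also covers a missing pull permission.
        return "ambiguous: bad credentials or missing pull permission"
    return "no registry auth error found"


if __name__ == "__main__":
    sample = (
        "error: build error: After retrying 2 times, Pull image still failed "
        "due to error: errors:\ndenied: requested access to the resource is denied\n"
    )
    print(triage_build_log(sample))
```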

Comment 22 W. Trevor King 2022-03-03 21:07:17 UTC
[1] introduced a regression, and it was reverted in master/4.11 in bug 2060605, with a revert in flight for 4.10 in bug 2060610. Moving this back to ASSIGNED so you can take another run at it.

[1]: https://github.com/openshift/image-registry/pull/291

Comment 29 Shiftzilla 2023-03-09 01:00:18 UTC
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira.

https://issues.redhat.com/browse/OCPBUGS-8821

