Bug 1500519

Summary: Logs are flooded with "unauthorized: authentication required" errors
Product: OpenShift Container Platform Reporter: Vikas Laad <vlaad>
Component: BuildAssignee: Jim Minter <jminter>
Status: CLOSED ERRATA QA Contact: Vikas Laad <vlaad>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.7.0CC: aos-bugs, jupierce, mfojtik, mifiedle, mkhan, tparsons, vlaad
Target Milestone: ---Keywords: Reopened
Target Release: 3.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
undefined
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-11-28 22:15:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
master logs
none
logs
none
master logs v153 none

Description Vikas Laad 2017-10-10 19:55:21 UTC
Description of problem:
Logs are flooded with following errors, just creating one cakephp quickstart app shows following errors in logs. This is a system container install of openshift.

Oct 10 19:50:45 ip-172-31-16-242.us-west-2.compute.internal atomic-openshift-master-api[8065]: I1010 19:50:45.338502    8085 rbac.go:116] RBAC DENY: user "system:serviceaccount:vlaad-test:builder" groups ["system:serviceaccounts" 
"system:serviceaccounts:vlaad-test" "system:authenticated"] cannot "get" resource "imagestreams/layers.image.openshift.io" named "django-psql-example" in namespace "svt-proj-49"
Oct 10 19:50:45 ip-172-31-16-242.us-west-2.compute.internal atomic-openshift-master-api[8065]: I1010 19:50:45.506542    8085 rbac.go:116] RBAC DENY: user "system:serviceaccount:vlaad-test:builder" groups ["system:serviceaccounts" 
"system:serviceaccounts:vlaad-test" "system:authenticated"] cannot "get" resource "imagestreams/layers.image.openshift.io" named "django-psql-example" in namespace "svt-proj-43"
Oct 10 19:50:45 ip-172-31-16-242.us-west-2.compute.internal atomic-openshift-master-api[8065]: I1010 19:50:45.592156    8085 rbac.go:116] RBAC DENY: user "system:serviceaccount:vlaad-test:builder" groups ["system:serviceaccounts" 
"system:serviceaccounts:vlaad-test" "system:authenticated"] cannot "delete" resource "images.image.openshift.io" cluster-wide
Oct 10 19:50:45 ip-172-31-16-242.us-west-2.compute.internal atomic-openshift-master-api[8065]: I1010 19:50:45.644750    8085 rbac.go:116] RBAC DENY: user "system:serviceaccount:vlaad-test:builder" groups ["system:serviceaccounts" 
"system:serviceaccounts:vlaad-test" "system:authenticated"] cannot "get" resource "imagestreams/layers.image.openshift.io" named "django-psql-example" in namespace "svt-proj-42"
Oct 10 19:50:45 ip-172-31-16-242.us-west-2.compute.internal atomic-openshift-master-api[8065]: I1010 19:50:45.696184    8085 rbac.go:116] RBAC DENY: user "system:serviceaccount:vlaad-test:builder" groups ["system:serviceaccounts" 
"system:serviceaccounts:vlaad-test" "system:authenticated"] cannot "delete" resource "images.image.openshift.io" cluster-wide


Version-Release number of selected component (if applicable):
openshift v3.7.0-0.143.1
kubernetes v1.7.0+80709908fd
etcd 3.2.1

How reproducible:
Always

Steps to Reproduce:
1. oc new-project vlaad-test
2. oc new-app --template=cakephp-mysql-example
3. see logs

Actual results:
Logs are filled with errors

Expected results:
No errors in logs

Additional info:
Master logs attached for this project.

Comment 1 Vikas Laad 2017-10-10 19:57:14 UTC
Created attachment 1336890 [details]
master logs

Comment 2 Ben Parees 2017-10-10 20:00:16 UTC
did the build succeed? is the app working?

Comment 3 Vikas Laad 2017-10-10 20:01:22 UTC
Yes, build succeeded and app is working.

Comment 4 Ben Parees 2017-10-10 20:01:58 UTC
and you did nothing else on this cluster except run new-app?

Comment 5 Vikas Laad 2017-10-10 20:04:38 UTC
No I did nothing else. I was also able to re-produce this on another cluster too. Trying on rpm install now.

Comment 6 Ben Parees 2017-10-10 20:09:21 UTC
the builder service account would normally never be used to remove images, so it certainly seems like something is attempting to run a prune using the builder service account.

Comment 7 Ben Parees 2017-10-10 20:10:29 UTC
and I do not see these messages on my origin cluster.

Comment 8 Mo 2017-10-10 20:52:35 UTC
The logs point to some strange errors from pkg/apps/controller/generictrigger/deploy_generictrigger_controller.go which Michal deleted in https://github.com/openshift/origin/pull/14910

Michal can you take a look to see why we may have delete calls from the builder SA?

Comment 9 Vikas Laad 2017-10-11 18:54:56 UTC
I am closing this bz as I dont see the problem when creating cluster with exact system container image tag v3.7.0-0.143.2 I was trying to use v3.7.0 before.

Looks like v3.7.0 is not latest image, I created another bz for this issue
https://bugzilla.redhat.com/show_bug.cgi?id=1500928

Comment 10 Vikas Laad 2017-10-12 14:00:15 UTC
I am sorry about the confusion, I am seeing these errors with the right image also v3.7.0-0.143.2 attaching logs from the new cluster where right system container images are used.

Comment 11 Vikas Laad 2017-10-12 14:00:37 UTC
Created attachment 1337804 [details]
logs

Comment 12 Ben Parees 2017-10-12 14:05:12 UTC
Vikas, do these errors only start to appear once you've created the build?  Or do they appear as soon as you create the project?

Comment 13 Vikas Laad 2017-10-12 14:16:20 UTC
It happens only after I create app/build.

Comment 14 Ben Parees 2017-10-12 15:09:06 UTC
the presence of these entries:
Oct 11 20:02:43 ip-172-31-11-89.us-west-2.compute.internal atomic-openshift-master-controllers[18101]: I1011 20:02:43.197604   18113 deploy_generictrigger_controller.go:73] Error instantiating deployment config vlaad/cakephp-mysql-example: cannot trigger a deployment for "cakephp-mysql-example" because it contains unresolved images

means this is from before mfojtik deleted that file.  So it's not running current master.  

Can you move up to the latest release before we dig into this further?

Comment 15 Vikas Laad 2017-10-12 15:14:12 UTC
I am running following right now

openshift v3.7.0-0.143.2
kubernetes v1.7.0+80709908fd
etcd 3.2.1

Attachment in comment #11 has logs from v3.7.0-0.143.2 cluster.

Comment 16 Ben Parees 2017-10-12 15:22:58 UTC
v3.7.0-0.143.2 still has the deploy_generictrigger_controller.go file.

ose appears to be up to v3.7.0-0.149.0

Comment 20 Ben Parees 2017-10-13 14:52:11 UTC
Moving this to modified as we are awaiting a puddle that can be used to verify if the issue still exists.

Note that my current theory is that the RBAC issues are coming from the imagechangecontroller trying to determine if a build should be run for a buildconfig, though i didn't think the imagechangecontroller ran as the builder service account anymore, so i'm still a little puzzled.  Also if that is what it's coming from, those requests should not be denied, since presumably that's how it's always worked, which would imply that something about the RBAC rules is not setup correctly.

Comment 22 Vikas Laad 2017-10-16 15:44:01 UTC
I deployed following version and I still see the error messages. Please let me know if I need to do anything else.

openshift v3.7.0-0.153.0
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8

Errors
Oct 16 15:36:48 ip-172-31-59-106.us-west-2.compute.internal atomic-openshift-master-api[1478]: I1016 15:36:48.381796    1491 rbac.go:116] RBAC DENY: user "system:serviceaccount:openshift-infra:build-controller" groups ["system:serviceaccounts" "system:serviceaccounts:openshift-infra" "system:authenticated"] cannot "create" resource "podsecuritypolicysubjectreviews.security.openshift.io" in namespace "vlaad"

Comment 23 Ben Parees 2017-10-16 16:09:22 UTC
>  RBAC DENY: user "system:serviceaccount:openshift-infra:build-controller" groups ["system:serviceaccounts" "system:serviceaccounts:openshift-infra" "system:authenticated"] cannot "create" resource "podsecuritypolicysubjectreviews.security.openshift.io" in namespace "vlaad"

this is a different denial and is likely related to the check that s2i builds do to see if you can run pods as root.  Not sure why the build-controller would be denied permission to do the check, or why being denied the check wouldn't fail the build, however.  

Is the build failing?

Comment 24 Vikas Laad 2017-10-16 16:36:59 UTC
Created attachment 1339389 [details]
master logs v153

Here are master logs grep for the project name, Build is not failing I just see these errors in logs.

Comment 25 Ben Parees 2017-10-16 20:50:44 UTC
Ok well the new message is i think unrelated to your original problems, but i believe it was introduced in: https://github.com/openshift/origin/pull/16647

Looks like we need to grant an additional permission to the builder SA.

Also we seem to have a lack of test coverage here because this bug would mean no one could ever run a build as root (even if their SCC allowed it).

So we need to add a TC to cover ensuring we actually allow users w/ appropriate SCCs to run as root (I know we have tests today that ensure you get denied).


(I should also note i can't actually recreate that RBAC DENY message in my own cluster, i'm not sure what to make of that, it may be worth enlisting Mo to at least confirm my suspicion).

Comment 26 Jim Minter 2017-10-16 23:37:21 UTC
Recreated -- needs loglevel >= 2 for the message to show.  Will fix and add test...

Comment 27 Jim Minter 2017-10-17 00:27:46 UTC
https://github.com/openshift/origin/pull/16898 (doesn't include tests yet)

Comment 28 Vikas Laad 2017-10-26 14:03:18 UTC
verified on openshift v3.7.0-0.178.0 version, dont see following errors anymore

RBAC DENY: user "system:serviceaccount:openshift-infra:build-controller" groups ["system:serviceaccounts" "system:serviceaccounts:openshift-infra" "system:authenticated"] cannot "create" resource "podsecuritypolicysubjectreviews.security.openshift.io" in namespace "vlaad"

Comment 31 errata-xmlrpc 2017-11-28 22:15:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188