Bug 1468629

Summary: OriginRoleBindingToRBACRoleBindingController namespace not found messages on namespace deletion
Product: OpenShift Container Platform
Reporter: Mike Fiedler <mifiedle>
Component: apiserver-auth
Assignee: Jordan Liggitt <jliggitt>
Status: CLOSED CURRENTRELEASE
QA Contact: Hongkai Liu <hongkliu>
Severity: medium
Priority: unspecified
Version: 3.6.0
CC: aos-bugs, ccoleman, eparis, hongkliu, jliggitt, xtian
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2017-08-16 19:35:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---

Description Mike Fiedler 2017-07-07 15:01:06 UTC
Description of problem:

In the 3.6 scalability cluster, every project deletion is spawning a burst of Error level messages in the master-controller logs.  Are these real errors:

Jul 07 10:55:08 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:08.347073   48797 namespace_controller.go:148] unexpected items still remain in namespace: mffiedler for gvr: { v1 pods}
Jul 07 10:55:09 172.16.0.8 atomic-openshift-master-controllers[48797]: I0707 10:55:09.271067   48797 vnids_master.go:144] Released netid 8547782 for namespace "mffiedler"
Jul 07 10:55:13 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:13.871714   48797 generic.go:65] OriginRoleBindingToRBACRoleBindingController: mffiedler/admin failed with : namespaces "mffiedler" not found
Jul 07 10:55:13 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:13.889269   48797 generic.go:65] OriginRoleBindingToRBACRoleBindingController: mffiedler/system:deployers failed with : namespaces "mffiedler" not found
Jul 07 10:55:13 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:13.890280   48797 generic.go:65] OriginRoleBindingToRBACRoleBindingController: mffiedler/system:image-pullers failed with : namespaces "mffiedler" not found
Jul 07 10:55:13 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:13.890988   48797 generic.go:65] OriginRoleBindingToRBACRoleBindingController: mffiedler/system:image-builders failed with : namespaces "mffiedler" not found
Jul 07 10:55:14 172.16.0.8 atomic-openshift-master-controllers[48797]: I0707 10:55:14.244678   48797 namespace_controller.go:169] Namespace has been deleted mffiedler
Jul 07 10:55:54 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:54.886637   48797 generic.go:65] OriginRoleBindingToRBACRoleBindingController: mffiedler/admin failed with : namespaces "mffiedler" not found
Jul 07 10:55:54 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:54.904189   48797 generic.go:65] OriginRoleBindingToRBACRoleBindingController: mffiedler/system:deployers failed with : namespaces "mffiedler" not found
Jul 07 10:55:54 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:54.905058   48797 generic.go:65] OriginRoleBindingToRBACRoleBindingController: mffiedler/system:image-pullers failed with : namespaces "mffiedler" not found
Jul 07 10:55:54 172.16.0.8 atomic-openshift-master-controllers[48797]: E0707 10:55:54.905886   48797 generic.go:65] OriginRoleBindingToRBACRoleBindingController: mffiedler/system:image-builders failed with : namespaces "mffiedler" not found


Version-Release number of selected component (if applicable): 3.6.126.1


How reproducible: Always in this environment


Steps to Reproduce:
1. Delete a project
2. Check master controller logs

Actual results:

Error messages as above

Expected results:

No errors for normal operations

Comment 1 Clayton Coleman 2017-07-07 18:18:47 UTC
It looks like the logging may be too aggressive here, but it could also be a logic bug.

Comment 2 Mike Fiedler 2017-07-07 18:40:26 UTC
Also seeing it at project creation time, but with reason "already exists"

Jul 07 14:35:53 172.16.0.16 atomic-openshift-master-controllers[87976]: E0707 14:35:53.069357   87976 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-2-12/system:image-pullers failed with : rolebindings.rbac.authorization.k8s.io "system:image-pullers" already exists
Jul 07 14:35:53 172.16.0.16 atomic-openshift-master-controllers[87976]: E0707 14:35:53.076880   87976 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-2-12/system:image-builders failed with : rolebindings.rbac.authorization.k8s.io "system:image-builders" already exists
Jul 07 14:35:53 172.16.0.16 atomic-openshift-master-controllers[87976]: E0707 14:35:53.077704   87976 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-2-12/system:deployers failed with : rolebindings.rbac.authorization.k8s.io "system:deployers" already exists

Comment 3 David Eads 2017-07-10 11:45:22 UTC
We haven't worried about these messages for other controllers.  Once the cache updates and indicates that the source role/rolebindings aren't present, it will stop trying to create them.

Comment 4 Jordan Liggitt 2017-07-11 16:00:40 UTC
fixed in https://github.com/openshift/origin/pull/15142

Comment 6 Hongkai Liu 2017-07-14 15:58:11 UTC
Still seeing those messages in:
root@ip-172-31-41-17: ~ # openshift version
openshift v3.6.144
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

===
root@ip-172-31-41-17: ~ # tail -f /var/log/messages | grep -E "F071|E071" | grep -v atomic-openshift-node | grep -i namespace | grep -i "not found"

Jul 14 11:33:02 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:02.385606   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk1/system:deployers failed with : namespaces "svt-hk1" not found
Jul 14 11:33:07 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:07.504728   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk1/admin failed with : namespaces "svt-hk1" not found
Jul 14 11:33:18 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:18.029720   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk7/admin failed with : namespaces "svt-hk7" not found
Jul 14 11:33:18 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:18.831764   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk5/system:image-pullers failed with : namespaces "svt-hk5" not found
Jul 14 11:33:19 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:19.240248   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk6/system:image-pullers failed with : namespaces "svt-hk6" not found
Jul 14 11:33:19 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:19.329791   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk7/system:deployers failed with : namespaces "svt-hk7" not found
Jul 14 11:33:19 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:19.981697   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk5/system:image-builders failed with : namespaces "svt-hk5" not found
Jul 14 11:33:20 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:20.782001   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk6/system:image-builders failed with : namespaces "svt-hk6" not found
Jul 14 11:33:21 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:21.381707   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk5/system:deployers failed with : namespaces "svt-hk5" not found
Jul 14 11:33:22 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:22.282038   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk6/system:deployers failed with : namespaces "svt-hk6" not found
Jul 14 11:33:22 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:22.481811   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk7/system:image-builders failed with : namespaces "svt-hk7" not found
Jul 14 11:33:22 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:22.581739   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk7/system:image-pullers failed with : namespaces "svt-hk7" not found
Jul 14 11:33:22 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:22.681464   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk5/admin failed with : namespaces "svt-hk5" not found
Jul 14 11:33:23 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:23.581163   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk6/admin failed with : namespaces "svt-hk6" not found
Jul 14 11:33:25 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:25.515024   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk8/admin failed with : namespaces "svt-hk8" not found
Jul 14 11:33:25 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:25.615096   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk8/system:deployers failed with : namespaces "svt-hk8" not found
Jul 14 11:33:25 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:25.715563   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk8/system:image-builders failed with : namespaces "svt-hk8" not found
Jul 14 11:33:25 ip-172-31-37-68 atomic-openshift-master: E0714 11:33:25.915404   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk8/system:image-pullers failed with : namespaces "svt-hk8" not found
Jul 14 11:37:59 ip-172-31-37-68 atomic-openshift-master: E0714 11:37:59.389882   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk1/system:image-pullers failed with : namespaces "svt-hk1" not found
Jul 14 11:38:19 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:19.884328   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk1/system:image-builders failed with : namespaces "svt-hk1" not found
Jul 14 11:38:30 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:30.119386   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk1/system:deployers failed with : namespaces "svt-hk1" not found
Jul 14 11:38:35 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:35.238747   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk1/admin failed with : namespaces "svt-hk1" not found
Jul 14 11:38:45 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:45.763568   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk7/admin failed with : namespaces "svt-hk7" not found
Jul 14 11:38:46 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:46.565609   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk5/system:image-pullers failed with : namespaces "svt-hk5" not found
Jul 14 11:38:46 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:46.974120   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk6/system:image-pullers failed with : namespaces "svt-hk6" not found
Jul 14 11:38:47 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:47.063558   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk7/system:deployers failed with : namespaces "svt-hk7" not found
Jul 14 11:38:47 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:47.715404   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk5/system:image-builders failed with : namespaces "svt-hk5" not found
Jul 14 11:38:48 ip-172-31-37-68 atomic-openshift-master: E0714 11:38:48.515573   18523 generic.go:65] OriginRoleBindingToRBACRoleBindingController: svt-hk6/system:image-builders failed with : namespaces "svt-hk6" not found

Comment 7 Jordan Liggitt 2017-07-14 16:06:02 UTC
can you double check the version of the server?

the log and the version command show different machines

Comment 8 Mike Fiedler 2017-07-14 17:17:32 UTC
Bad puddle contents?   From that system:

# yum list installed | grep openshift
atomic-openshift.x86_64         3.6.144-1.git.0.31aa217.el7
atomic-openshift-clients.x86_64 3.6.144-1.git.0.31aa217.el7
atomic-openshift-clients-redistributable.x86_64
atomic-openshift-docker-excluder.noarch
atomic-openshift-dockerregistry.x86_64
atomic-openshift-excluder.noarch
atomic-openshift-master.x86_64  3.6.144-1.git.0.31aa217.el7
atomic-openshift-node.x86_64    3.6.144-1.git.0.31aa217.el7
atomic-openshift-pod.x86_64     3.6.144-1.git.0.31aa217.el7
atomic-openshift-sdn-ovs.x86_64 3.6.144-1.git.0.31aa217.el7
atomic-openshift-tests.x86_64   3.6.144-1.git.0.31aa217.el7
atomic-openshift-utils.noarch   3.6.144-1.git.0.50e12bf.e

# openshift version
openshift v3.6.144
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

Comment 9 Mike Fiedler 2017-07-14 19:14:58 UTC
Re: comment 7: the IP in the log is from our gold AMI provisioner base system. The actual machine is the one shown in the command prompt.

Comment 10 Jordan Liggitt 2017-07-15 00:16:58 UTC
Found a second issue. If the namespace cleanup deleted the container object (e.g. policy, policybinding) first, the deletions were processed cleanly.

If the namespace cleanup deleted the individual virtual objects (e.g. role, rolebinding) first, we were not removing the nested objects correctly.

Beyond namespace cleanup, this actually affected correctness of the controller in the following scenarios:

role deletion:
1. create origin role (virtual)
2. policy object gets updated to add the new role
3. controller observes policy update, creates rbac role
4. delete origin role (virtual)
5. policy object gets updated to remove the role
6. controller observes policy update, should delete the corresponding rbac role but misses processing the removed role

role-binding deletion:
1. create origin rolebinding (virtual)
2. policy object gets updated to add the new rolebinding
3. controller observes policy update, creates rbac rolebinding
4. delete origin rolebinding (virtual)
5. policy object gets updated to remove the rolebinding
6. controller observes policy update, should delete the corresponding rbac rolebinding but misses processing the removed rolebinding
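
Step 6 in both scenarios comes down to comparing the policy object before and after the update and acting on entries that disappeared. The Go sketch below shows that comparison in isolation. It is an assumption-laden illustration, not the controller's actual code: the function `removedKeys` and the use of plain name sets are hypothetical, and the real objects carry far more than names.

```go
package main

import (
	"fmt"
	"sort"
)

// removedKeys returns the names present in oldSet but absent from
// newSet, sorted for stable output. In the scenarios above these are
// the roles or rolebindings whose RBAC counterparts should be deleted.
func removedKeys(oldSet, newSet map[string]struct{}) []string {
	var removed []string
	for name := range oldSet {
		if _, ok := newSet[name]; !ok {
			removed = append(removed, name)
		}
	}
	sort.Strings(removed)
	return removed
}

func main() {
	// Hypothetical before/after contents of a policy object.
	oldPolicy := map[string]struct{}{"admin": {}, "custom-role": {}}
	newPolicy := map[string]struct{}{"admin": {}}

	for _, name := range removedKeys(oldPolicy, newPolicy) {
		fmt.Printf("would delete rbac role %q\n", name)
	}
}
```

A handler that only walks the entries in the new policy never sees `custom-role` here, which is the kind of missed deletion the two scenarios describe.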

Comment 11 Jordan Liggitt 2017-07-15 00:17:43 UTC
fixed in https://github.com/openshift/origin/pull/15223 and will add an integration test around this scenario

Comment 12 Jordan Liggitt 2017-07-18 17:42:08 UTC
fixed in 3.6.153-1

Comment 13 Hongkai Liu 2017-07-19 18:41:17 UTC
# openshift version
openshift v3.6.153
kubernetes v1.6.1+5115d708d7
etcd 3.2.1


===
# tail -f /var/log/messages | grep -E "F071|E071" | grep -v atomic-openshift-node | grep -i namespace | grep -i "not found"

Jul 19 14:24:39 ip-172-31-1-2 atomic-openshift-master: E0719 14:24:39.718847   36306 garbagecollector.go:167] Error syncing item &garbagecollector.node{identity:garbagecollector.objectReference{OwnerReference:v1.OwnerReference{APIVersion:"v1", Kind:"ReplicationController", Name:"deploymentconfig2v0-1", UID:"86953bcf-6cad-11e7-8de7-0251a88f5244", Controller:(*bool)(nil), BlockOwnerDeletion:(*bool)(nil)}, Namespace:"c9"}, dependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, dependents:map[*garbagecollector.node]struct {}{(*garbagecollector.node)(0xc4397424e0):struct {}{}, (*garbagecollector.node)(0xc4395fe9c0):struct {}{}}, deletingDependents:false, deletingDependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, beingDeleted:false, beingDeletedLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, owners:[]v1.OwnerReference{v1.OwnerReference{APIVersion:"apps.openshift.io/v1", Kind:"DeploymentConfig", Name:"deploymentconfig2v0", UID:"86849a73-6cad-11e7-8de7-0251a88f5244", Controller:(*bool)(0xc422e25c88), BlockOwnerDeletion:(*bool)(0xc422e25c89)}}}: replicationcontrollers "deploymentconfig2v0-1" not found


===
No messages like the following pattern were found in the log:
*** failed with : namespaces "*" not found

I believe the one message caught in the log matches a different pattern and is not related to this bug.
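
As a sketch of how the pattern check above could be made precise, the following Go snippet matches only lines of the form `failed with : namespaces "<name>" not found`. The regular expression and the helper name `isNamespaceNotFound` are assumptions for illustration and were not part of the original verification, which used the grep pipeline shown earlier.

```go
package main

import (
	"fmt"
	"regexp"
)

// namespaceNotFound matches the controller message reported in this
// bug: `failed with : namespaces "<name>" not found`.
var namespaceNotFound = regexp.MustCompile(`failed with : namespaces "[^"]+" not found`)

// isNamespaceNotFound reports whether a log line carries that message.
func isNamespaceNotFound(line string) bool {
	return namespaceNotFound.MatchString(line)
}

func main() {
	lines := []string{
		`OriginRoleBindingToRBACRoleBindingController: svt-hk1/admin failed with : namespaces "svt-hk1" not found`,
		`Error syncing item: replicationcontrollers "deploymentconfig2v0-1" not found`,
	}
	for _, line := range lines {
		fmt.Println(isNamespaceNotFound(line))
	}
}
```

Unlike a loose `grep -i namespace | grep -i "not found"`, this rejects the garbage collector line, which mentions a namespace field and "not found" but concerns a replication controller.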


Jordan, thanks for the fix.

Comment 15 Jordan Liggitt 2017-08-14 22:16:28 UTC
This was fixed in 3.6.0