Bug 1461340

Summary: Upgrade from 3.5 to 3.6 loses access rights to the 'Shared' project
Product: OpenShift Container Platform Reporter: ge liu <geliu>
Component: apiserver-auth Assignee: Jordan Liggitt <jliggitt>
Status: CLOSED WORKSFORME QA Contact: Chuan Yu <chuyu>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.6.0 CC: aos-bugs, geliu, jokerman, jupierce, mmccomas, sdodson
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2017-06-16 12:59:37 UTC Type: Bug

Description ge liu 2017-06-14 09:05:13 UTC
Description of problem: 

In an OCP 3.5 environment, create a project in the UI with the option "Shared: Allow any authenticated user to pull images". Log in as any user and the shared project is visible. After upgrading to 3.6, log in as the same user: the shared project is missing.

openshift v3.6.106
kubernetes v1.6.1+5115d708d7
etcd 3.2.0

How reproducible:
Always

Steps to Reproduce:

1. Log in to the UI at https://registry-console-default.0614-tal.qe.rhcloud.com/registry#/projects and create a project with the option "Shared: Allow any authenticated user to pull images".

2. Log in as a regular user; the shared project (anliproject) is visible.

3. Upgrade from 3.5 to 3.6.

4. Log in as the same user; the shared project is missing:
# oc get project
NAME      DISPLAY NAME   STATUS
lgproj                   Active


Actual results:

The shared project cannot be accessed after the upgrade.

Expected results:

The shared project can still be accessed after the upgrade.

Comment 1 ge liu 2017-06-14 09:29:06 UTC
To clarify a couple of points in the description above:

1) The shared project still exists after the upgrade. The admin and the project owner can access it, but other users cannot.

2) I checked the policybindings of the shared project and found that the items below are missing:
#  oc get policybinding -n anliproject -o json
.......................
................

{
                    "name": "registry-admin",
                    "roleBinding": {
                        "groupNames": null,
                        "metadata": {
                            "creationTimestamp": "2017-06-14T08:25:54Z",
                            "name": "registry-admin",
                            "namespace": "anliproject",
                            "resourceVersion": "2963",
                            "uid": "1544c621-50db-11e7-b2ed-0e8ce8792814"
                        },
                        "roleRef": {
                            "name": "registry-admin"
                        },
                        "subjects": [
                            {
                                "kind": "User",
                                "name": "anli"
                            }
                        ],
                        "userNames": [
                            "anli"
                        ]
                    }
                },
                {
                    "name": "registry-viewer",
                    "roleBinding": {
                        "groupNames": [
                            "system:authenticated"
                        ],
                        "metadata": {
                            "creationTimestamp": "2017-06-14T08:25:53Z",
                            "name": "registry-viewer",
                            "namespace": "anliproject",
                            "resourceVersion": "2943",
                            "uid": "14a10e75-50db-11e7-b2ed-0e8ce8792814"
                        },
                        "roleRef": {
                            "name": "registry-viewer"
                        },
                        "subjects": [
                            {
                                "kind": "SystemGroup",
                                "name": "system:authenticated"
                            }
                        ],
                        "userNames": null
                    }
                },
................
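
One way to check whether a namespace still grants access to every authenticated user is to scan the JSON output for bindings whose group list contains system:authenticated. The Python below is a minimal sketch, not part of the original report. It assumes the output is either a single policybinding or a list with an "items" array, and that each policybinding carries a "roleBindings" array of name/roleBinding entries shaped like the excerpt above. The helper name bindings_for_group and the trimmed sample data are hypothetical.

```python
import json


def bindings_for_group(policybinding_json, group):
    """Return sorted names of role bindings that list `group` in groupNames."""
    doc = json.loads(policybinding_json)
    # Assumption: the output is either one object or a list wrapper with "items".
    items = doc.get("items", [doc])
    names = []
    for item in items:
        for entry in item.get("roleBindings") or []:
            # groupNames can be null in the output, so fall back to an empty list.
            groups = entry.get("roleBinding", {}).get("groupNames") or []
            if group in groups:
                names.append(entry["name"])
    return sorted(names)


# Trimmed sample that mirrors the two entries shown in the excerpt.
sample = json.dumps({
    "items": [{
        "roleBindings": [
            {"name": "registry-admin",
             "roleBinding": {"groupNames": None, "userNames": ["anli"]}},
            {"name": "registry-viewer",
             "roleBinding": {"groupNames": ["system:authenticated"],
                             "userNames": None}},
        ]
    }]
})

print(bindings_for_group(sample, "system:authenticated"))
```

An empty result for system:authenticated would match the symptom described here, where only the admin and the owner can reach the project.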

Comment 2 ge liu 2017-06-14 09:34:15 UTC
Post-upgrade:

# oc get policybinding -n anliprojoect
NAME       ROLE BINDINGS                                                          LAST MODIFIED
:default   admin, system:deployers, system:image-builders, system:image-pullers   2017-06-13 06:29:28 -0400 EDT


Without upgrade:

#  oc get policybinding -n anliproject
NAME       ROLE BINDINGS                                                                                           LAST MODIFIED
:default   admin, registry-admin, registry-viewer, system:deployers, system:image-builders, system:image-pullers   2017-06-14 04:25:54 -0400 EDT
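
The difference between the two listings can be computed rather than read by eye. The following Python is a small sketch that treats the ROLE BINDINGS column as a comma-separated list and reports which names appear only in the non-upgraded listing. The strings are copied from the two outputs above (note that they come from two differently named namespaces); the helper name role_bindings is hypothetical.

```python
def role_bindings(column):
    """Split the comma-separated ROLE BINDINGS column into a set of names."""
    return {name.strip() for name in column.split(",") if name.strip()}


# Values copied from the two listings above.
upgraded = role_bindings(
    "admin, system:deployers, system:image-builders, system:image-pullers")
not_upgraded = role_bindings(
    "admin, registry-admin, registry-viewer, system:deployers, "
    "system:image-builders, system:image-pullers")

# Bindings present without the upgrade but absent in the upgraded listing.
missing = sorted(not_upgraded - upgraded)
print(missing)  # ['registry-admin', 'registry-viewer']
```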

Comment 3 Jordan Liggitt 2017-06-14 18:05:23 UTC
Nothing in the upgrade process removes rolebindings or policybindings. I'm not able to recreate this by starting a 3.5 server and then a 3.6 server.

Comment 4 Scott Dodson 2017-06-14 19:11:13 UTC
Justin, is this similar to something you were looking into in a preview environment?

Comment 5 Justin Pierce 2017-06-14 21:32:54 UTC
It definitely sounds similar. In our 3.5 to 3.6 upgrade on free-int, the clusterrolebinding created by:
oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:openshift-devops-monitor:prometheus

disappeared after the upgrade - according to Clayton.

Comment 6 Jordan Liggitt 2017-06-16 01:06:19 UTC
I still cannot recreate this. Reconcile of clusterroles and clusterrolebindings left user-created bindings intact as expected.

Did you delete/recreate the registry-admin and registry-viewer roles?

Comment 7 ge liu 2017-06-16 01:53:50 UTC
Jordan, this problem happened in the environment below, which was set up by the upgrade team, during an upgrade from 3.5.5.24 to v3.6.106. I tried another upgrade from 3.5.5.25 to v3.6.106 but could not reproduce it. The two upgrade environments differ in some installation configuration (such as multitenancy, LDAP, etc.), but those differences do not look like they would cause this problem. Could you please log in to it and look for clues?

Masters:
openshift-147.lab.sjc.redhat.com
Nodes:
openshift-102.lab.sjc.redhat.com
openshift-103.lab.sjc.redhat.com
openshift-104.lab.sjc.redhat.com
openshift-106.lab.sjc.redhat.com
openshift-127.lab.sjc.redhat.com

Comment 9 ge liu 2017-06-16 01:57:45 UTC
The shared project is 'anliprojoect' and the project owner is anli2. Currently, only the admin and the owner can access the shared project.

Comment 10 Jordan Liggitt 2017-06-16 02:22:28 UTC
The creation and modification timestamps of the project, rolebindings, and policybinding all match:

# oc get ns anliprojoect -o yaml | grep creationTimestamp
  creationTimestamp: 2017-06-13T10:29:28Z

# oc get policybinding -n anliprojoect -o yaml | grep creationTimestamp
    creationTimestamp: 2017-06-13T10:29:28Z
        creationTimestamp: 2017-06-13T10:29:28Z
        creationTimestamp: 2017-06-13T10:29:28Z
        creationTimestamp: 2017-06-13T10:29:28Z
        creationTimestamp: 2017-06-13T10:29:28Z

# oc get policybinding -n anliprojoect -o yaml | grep Modified
  lastModified: 2017-06-13T10:29:28Z


I don't see any evidence that the project or rolebindings were modified after the project was created, during an upgrade or otherwise.
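
The timestamp comparison above can be expressed as a simple check: if any creationTimestamp or lastModified value were later than the namespace creation time, that would point to a change after the project was created. The Python below is an illustrative sketch using the values from the grep output. The helper names parse_utc and modified_after_creation are hypothetical, and the parser assumes timestamps in the exact YYYY-MM-DDTHH:MM:SSZ form shown.

```python
from datetime import datetime, timezone


def parse_utc(stamp):
    """Parse a timestamp of the form 2017-06-13T10:29:28Z as UTC."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def modified_after_creation(created, stamps):
    """Return the stamps that are strictly later than the creation time."""
    start = parse_utc(created)
    return [s for s in stamps if parse_utc(s) > start]


# Values copied from the grep output above.
namespace_created = "2017-06-13T10:29:28Z"
policybinding_created = ["2017-06-13T10:29:28Z"] * 5
policybinding_last_modified = ["2017-06-13T10:29:28Z"]

later = modified_after_creation(
    namespace_created, policybinding_created + policybinding_last_modified)
print(later)  # [] means nothing is newer than the namespace itself
```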

Comment 11 ge liu 2017-06-16 08:39:21 UTC
OK, if there are no useful clues, please close this bug. We will reopen it if it reproduces in the future. Thanks.