Bug 1576527

Summary: Task fails when not authenticated as admin.
Product: OpenShift Container Platform
Component: Installer
Version: 3.7.1
Target Release: 3.7.z
Hardware: Unspecified
OS: Unspecified
Severity: medium
Priority: unspecified
Status: CLOSED ERRATA
Reporter: Ryan Howe <rhowe>
Assignee: Scott Dodson <sdodson>
QA Contact: liujia <jiajliu>
CC: aos-bugs, jokerman, mmccomas, sdodson
Doc Type: Bug Fix
Doc Text:
Certain upgrade tasks used the default kubeconfig, which the admin may have modified in a way that prevented the upgrade from succeeding. The upgrade playbooks now use an admin-specific kubeconfig that is not prone to being altered, ensuring the upgrade proceeds properly.
Last Closed: 2018-06-27 07:59:12 UTC
Type: Bug

Description Ryan Howe 2018-05-09 16:39:25 UTC
Description of problem:
During a 3.7 upgrade, the task [Confirm OpenShift authorization objects are in sync] runs "oc adm migrate authorization" using the default kubeconfig of the ansible ssh user. When that kubeconfig is not authenticated as a cluster admin, the task fails.

Version-Release number of the following components:
atomic-openshift-utils  3.7.42

How reproducible:
100%

Steps to Reproduce:
1. As the ansible ssh user, "oc login" as a non-admin user (so the default kubeconfig no longer holds admin credentials)
2. Run the ansible 3.7 upgrade


Actual results:
TASK [Confirm OpenShift authorization objects are in sync] ****************************************************************************************************************************************************
FAILED - RETRYING: Confirm OpenShift authorization objects are in sync (2 retries left).
FAILED - RETRYING: Confirm OpenShift authorization objects are in sync (1 retries left).
fatal: [master1]: FAILED! => {"attempts": 2, "changed": false, "cmd": ["oc", "adm", "migrate", "authorization"], "delta": "0:00:00.304060", "end": "2018-05-08 11:11:33.862479", "msg": "non-zero return code", "rc": 1, "start": "2018-05-08 11:11:33.558419", "stderr": "error: You must be logged in to the server (the server has asked for the client to provide credentials)", "stderr_lines": ["error: You must be logged in to the server (the server has asked for the client to provide credentials)"], "stdout": "", "stdout_lines": []}
 [WARNING]: Could not create retry file '/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_7/upgrade.retry'.         [Errno 13] Permission denied: u'/usr/share/ansible
/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_7/upgrade.retry'


Expected results:

Success 


Additional info:

  https://github.com/openshift/openshift-ansible/blob/release-3.7/playbooks/common/openshift-cluster/upgrades/v3_7/validator.yml#L17

Should look like this: 

    command: >
      {{ openshift.common.client_binary }} adm --config /etc/origin/master/admin.kubeconfig migrate authorization
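
For context, a minimal sketch of what the whole corrected task might look like (the task name and retry count are inferred from the failure output above; the register variable, delay, and other details are assumptions, not copied from validator.yml):

    - name: Confirm OpenShift authorization objects are in sync
      command: >
        {{ openshift.common.client_binary }} adm
        --config /etc/origin/master/admin.kubeconfig
        migrate authorization
      register: l_migrate_result       # hypothetical variable name
      until: l_migrate_result.rc == 0
      retries: 2                       # retry count inferred from the failure output
      delay: 5                         # assumed
      changed_when: false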

Comment 2 Scott Dodson 2018-05-18 13:39:20 UTC
Fixed in openshift-ansible-3.10.0-0.40.0 and later

Comment 3 liujia 2018-05-29 09:44:16 UTC
Tried several scenarios to reproduce the bug on openshift-ansible-3.7.46-1.git.0.37f607e.el7.noarch (which does not include the fix), but could not reproduce it.

Scenario 1:
1. Run the upgrade from ocp v3.6 to v3.7 with a non-root user.
ansible_user=cloud-user
ansible_become=yes
2. Upgrade succeeded.
TASK [Confirm OpenShift authorization objects are in sync] 
ok: [x.x.x.x] => {
    "attempts": 1,
    "changed": false,
    "cmd": [
        "oc",
        "adm",
        "migrate",
        "authorization"
    ],
    "delta": "0:00:00.728343",
    "end": "2018-05-29 01:39:43.603039",
    "failed": false,
    "invocation": {
        "module_args": {
            "_raw_params": "oc adm migrate authorization",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "warn": true
        }
    },
    "rc": 0,
    "start": "2018-05-29 01:39:42.874696",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "summary: total=205 errors=0 ignored=0 unchanged=205 migrated=0",
    "stdout_lines": [
        "summary: total=205 errors=0 ignored=0 unchanged=205 migrated=0"
    ]
}

Scenario 2:
1. "oc login" with cloud-user and keep login status
# oc whoami
cloud-user

2. Run the upgrade from ocp v3.6 to v3.7 with a non-root user.
ansible_user=cloud-user
ansible_become=yes
3. Upgrade succeeded.

Scenario 3:
1. "oc login" with cloud-user and wait for login invalid(token expired).
# oc whoami
error: You must be logged in to the server (the server has asked for the client to provide credentials (get users.user.openshift.io ~))

2. Run the upgrade from ocp v3.6 to v3.7 with a non-root user.
ansible_user=cloud-user
ansible_become=yes

3. Upgrade succeeded.
TASK [Confirm OpenShift authorization objects are in sync] ******************************************************************************************************************
ok: [x.x.x.x] => {"attempts": 1, "changed": false, "cmd": ["oc", "adm", "migrate", "authorization"], "delta": "0:00:00.769859", "end": "2018-05-29 05:26:35.341829", "failed": false, "rc": 0, "start": "2018-05-29 05:26:34.571970", "stderr": "", "stderr_lines": [], "stdout": "summary: total=201 errors=0 ignored=0 unchanged=201 migrated=0", "stdout_lines": ["summary: total=201 errors=0 ignored=0 unchanged=201 migrated=0"]}

@Scott
Do you know how to reproduce the issue? Or should I just check that the PR was merged into the latest v3.10 installer?

Comment 4 Scott Dodson 2018-05-31 14:54:17 UTC
I think that because you've set ansible_become=yes, the changes you're making to cloud-user are irrelevant. You'd need to alter the login of the root user, because ansible is going to execute all commands as root.
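
One way to confirm which user the upgrade tasks actually run as is an ad-hoc check like the following (inventory file and group names are hypothetical, output abbreviated):

    $ ansible masters -i hosts -m command -a whoami --become
    master1 | SUCCESS | rc=0 >>
    root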

Comment 5 liujia 2018-06-04 07:35:51 UTC
Reproduced on openshift-ansible-3.7.46-1.git.0.37f607e.el7.noarch

1. Install ocp v3.6
2. Run "oc login" with a non-admin user on the master hosts (ssh as root) to change the default ~/.kube/config (ensure this file differs from /etc/origin/master/admin.kubeconfig); see the sketch below.
3. Run the upgrade against the above ocp
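
A concrete sketch of step 2 (the server URL, username, and password are hypothetical):

    # ssh to a master as root, then log in as a non-admin user so that
    # root's ~/.kube/config no longer carries admin credentials:
    oc login https://master.example.com:8443 -u developer -p <password>
    oc whoami                                                  # -> developer (non-admin)
    diff ~/.kube/config /etc/origin/master/admin.kubeconfig    # files should differ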

Comment 6 liujia 2018-06-04 10:01:23 UTC
There is no [Confirm OpenShift authorization objects are in sync] task in v3.10; for this bug, the fix should be in PR https://github.com/openshift/openshift-ansible/pull/8499/. Changed the target version to v3.7.

Verified on openshift-ansible-3.7.51-1.git.0.f9b681c.el7.noarch

TASK [Confirm OpenShift authorization objects are in sync] ******************************************************************************************************************
ok: [x] => {"attempts": 1, "changed": false, "cmd": ["oc", "adm", "migrate", "authorization", "--config=/etc/origin/master/admin.kubeconfig"], "delta": "0:00:00.782449", "end": "2018-06-04 05:49:46.540957", "failed": false, "rc": 0, "start": "2018-06-04 05:49:45.758508", "stderr": "", "stderr_lines": [], "stdout": "summary: total=201 errors=0 ignored=0 unchanged=201 migrated=0", "stdout_lines": ["summary: total=201 errors=0 ignored=0 unchanged=201 migrated=0"]}

Comment 7 liujia 2018-06-04 10:04:27 UTC
Added case OCP-18479

Comment 9 errata-xmlrpc 2018-06-27 07:59:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2009