Bug 1722483

Summary: [3.10] upgrade failed at TASK [ansible_service_broker : Create the Broker resource in the catalog]
Product: OpenShift Container Platform
Reporter: Weihua Meng <wmeng>
Component: Cluster Version Operator
Assignee: Joseph Callen <jcallen>
Status: CLOSED ERRATA
QA Contact: Weihua Meng <wmeng>
Severity: high
Docs Contact:
Priority: high
Version: 3.10.0
CC: aos-bugs, jcallen, jokerman, mmccomas, shurley
Target Milestone: ---   
Target Release: 3.10.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: When openshift facts was recently modified, the ipv4 dictionary item no longer existed.
Consequence: MTU was set incorrectly.
Fix: Remove the conditional with ipv4.
Result: MTU is set correctly.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-07-24 13:47:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Weihua Meng 2019-06-20 13:00:46 UTC
Description of problem:
[3.10] upgrade failed at TASK [ansible_service_broker : Create the Broker resource in the catalog]

Version-Release number of the following components:
openshift-ansible-3.10.149-1.git.0.eb0262c.el7

How reproducible:
N/A

Steps to Reproduce:
1. Upgrade OCP v3.9.78 to v3.10.149


Actual results:
upgrade failed.
TASK [ansible_service_broker : Create the Broker resource in the catalog] ******
task path: /home/slave1/workspace/Run-Ansible-Playbooks-Nextge/private-openshift-ansible/roles/ansible_service_broker/tasks/install.yml:217

    "msg": {
        "cmd": "/usr/bin/oc get ClusterServiceBroker ansible-service-broker -o json -n default", 
        "results": [
            {}
        ], 
        "returncode": 1, 
        "stderr": "Error from server (ServiceUnavailable): the server is currently unable to handle the request (get clusterservicebrokers.servicecatalog.k8s.io ansible-service-broker)\n", 
        "stdout": ""
    }
}

Expected results:
upgrade success.

Comment 2 Weihua Meng 2019-06-20 13:13:50 UTC
1. failed upgrade.
Log: https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/Run-Ansible-Playbooks-Nextge/325/consoleFull

The result of the command is unstable:
[root@wmeng203harpm39-master-etcd-1 ~]# oc get ClusterServiceBroker
NAME                      AGE
ansible-service-broker    2h
template-service-broker   2h
[root@wmeng203harpm39-master-etcd-1 ~]# oc get ClusterServiceBroker
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get clusterservicebrokers.servicecatalog.k8s.io)
[root@wmeng203harpm39-master-etcd-1 ~]# oc get ClusterServiceBroker
NAME                      AGE
ansible-service-broker    2h
template-service-broker   2h

2. successful upgrade with same config
Log: https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/Run-Ansible-Playbooks-Nextge/324/consoleFull

The result is unstable here, too:
[root@wmeng202harpm39-master-etcd-1 ~]# oc get ClusterServiceBroker ansible-service-broker
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get clusterservicebrokers.servicecatalog.k8s.io ansible-service-broker)
[root@wmeng202harpm39-master-etcd-1 ~]# oc get ClusterServiceBroker ansible-service-broker
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get clusterservicebrokers.servicecatalog.k8s.io ansible-service-broker)
[root@wmeng202harpm39-master-etcd-1 ~]# oc get ClusterServiceBroker ansible-service-broker
NAME                     AGE
ansible-service-broker   2h
[root@wmeng202harpm39-master-etcd-1 ~]# oc get ClusterServiceBroker 
NAME                      AGE
ansible-service-broker    2h
template-service-broker   2h

Actually, both clusters are unstable for "oc get ClusterServiceBroker"; the successful run was just lucky when the command executed.
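
To put a number on the flakiness instead of relying on a few manual runs, the command can be repeated and the ServiceUnavailable errors counted. This is only a sketch: the function names, the attempt count, and the delay are made up for illustration, and it assumes "oc" is on PATH and already logged in to the cluster.

```python
import subprocess
import time
from typing import Callable, Sequence, Tuple

# Same command as in the failed task, minus the JSON output flag.
CMD = ["oc", "get", "ClusterServiceBroker", "ansible-service-broker"]


def run_oc(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run the command once and return (returncode, stderr)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


def count_unavailable(
    attempts: int,
    runner: Callable[[Sequence[str]], Tuple[int, str]] = run_oc,
    delay_seconds: float = 1.0,
) -> int:
    """Count how many of `attempts` runs fail with ServiceUnavailable."""
    failures = 0
    for _ in range(attempts):
        returncode, stderr = runner(CMD)
        if returncode != 0 and "ServiceUnavailable" in stderr:
            failures += 1
        time.sleep(delay_seconds)
    return failures


if __name__ == "__main__":
    total = 20  # hypothetical sample size
    failed = count_unavailable(total)
    print("{} of {} requests returned ServiceUnavailable".format(failed, total))
```

A non-zero count on a cluster where the upgrade "passed" would support the point above that the passing run was a matter of timing.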

Comment 4 Weihua Meng 2019-06-20 13:18:58 UTC
Hi, Shawn
Could you take a look at this? 
Thanks.

Comment 5 Shawn Hurley 2019-06-20 16:03:55 UTC
If your install never worked, then it makes sense that the upgrade could have problems.

I think this looks VERY similar to https://bugzilla.redhat.com/show_bug.cgi?id=1720581, and I think we should try to find the root cause of the GCP clusters that have problematic aggregated API servers during the openshift-ansible install. Can you please make note of anything you're doing that is specific to GCP?

Comment 6 Joseph Callen 2019-06-20 18:31:32 UTC
Probably related to the MTU issue. Once https://github.com/openshift/openshift-ansible/pull/11707 is merged, I will cherry-pick it for previous versions.
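
For anyone who wants to sanity-check the MTU on a node, a rough sketch follows. It reads the interface MTU from Linux sysfs and subtracts an encapsulation overhead. The 50-byte figure assumes a VXLAN-based SDN overlay, and the function names and the "eth0" interface are illustrative; none of this is taken from the linked pull request.

```python
from pathlib import Path

# Assumption: the overlay uses VXLAN, which adds roughly 50 bytes per packet.
VXLAN_OVERHEAD = 50


def read_interface_mtu(interface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the MTU of a network interface from Linux sysfs."""
    return int(Path(sysfs_root, interface, "mtu").read_text().strip())


def expected_overlay_mtu(interface_mtu: int, overhead: int = VXLAN_OVERHEAD) -> int:
    """Largest overlay MTU that still fits inside the underlying interface."""
    return interface_mtu - overhead


if __name__ == "__main__":
    iface = "eth0"  # hypothetical interface name; adjust per node
    mtu = read_interface_mtu(iface)
    print("{}: interface MTU {}, overlay MTU should be at most {}".format(
        iface, mtu, expected_overlay_mtu(mtu)))
```

If the configured SDN MTU is larger than the printed limit, large packets between nodes may be dropped, which could explain intermittent failures like the ones in this bug.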

Comment 9 Weihua Meng 2019-07-12 13:17:29 UTC
Fixed.
openshift-ansible-3.10.153-1.git.0.2363fa8.el7
upgrade success.

Comment 11 errata-xmlrpc 2019-07-24 13:47:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1755