Bug 1634700

Summary: [3.11] Modification in master-config.yaml is causing a failure during Master-API restart (runtime-config)
Product: OpenShift Container Platform Reporter: Simon Reber <sreber>
Component: Installer Assignee: Scott Dodson <sdodson>
Status: CLOSED ERRATA QA Contact: ge liu <geliu>
Severity: high Docs Contact:
Priority: high    
Version: 3.10.0 CC: aos-bugs, cshereme, geliu, gpei, jokerman, mmccomas, rhowe, sdodson
Target Milestone: ---   
Target Release: 3.11.z   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previously, when upgrading from earlier versions of the product the step that removes pod presets configuration may have resulted in a master-config.yaml file that failed to parse properly. This error has been corrected and the master-config.yaml should now be updated properly in all scenarios.
Story Points: ---
Clone Of:
: 1642148 (view as bug list) Environment:
Last Closed: 2018-11-20 03:10:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1642148    

Description Simon Reber 2018-10-01 11:37:27 UTC
Description of problem:

When upgrading from Red Hat OpenShift Container Platform 3.9 to Red Hat OpenShift Container Platform 3.10, the upgrade playbook fails because `runtime-config: ''` is added to `master-config.yaml` by https://github.com/openshift/openshift-ansible/blob/release-3.10/roles/openshift_control_plane/tasks/upgrade.yml#L132

F0928 10:01:42.736248       1 start_api.go:68] could not load config file "/etc/origin/master/master-config.yaml" due to an error: error reading config: [pos 4311]: only encoded map or array can be decoded into a slice (6)
[root@master01 ~]# master-logs controllers controllers
F0928 10:01:42.737000       1 start_controllers.go:67] could not load config file "/etc/origin/master/master-config.yaml" due to an error: error reading config: [pos 4311]: only encoded map or array can be decoded into a slice (6)

On inspection, `runtime-config: ''` needs to be replaced with `runtime-config: []`, because the config loader expects a list here, not a string. Once this is changed manually, everything starts and the upgrade proceeds without issue.
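
For illustration, the manual workaround amounts to the following change in /etc/origin/master/master-config.yaml. This is a minimal sketch: the nesting is taken from the key path `kubernetesMasterConfig.apiServerArguments.runtime-config` used by the playbook, and all other keys of the file are omitted.

```yaml
kubernetesMasterConfig:
  apiServerArguments:
    # Written by the upgrade playbook; an empty string is a scalar,
    # which produces "only encoded map or array can be decoded into a slice":
    #   runtime-config: ''
    # Manual fix: an empty list, which the config loader accepts.
    runtime-config: []
```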

Version-Release number of selected component (if applicable):

 - atomic-openshift-3.10.45-1.git.0.3b98bf6.el7

How reproducible:

 - Always, when `runtime-config` is set to `apis/settings.k8s.io/v1alpha1=true`
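
A pre-upgrade master-config.yaml that matches this condition would look roughly like the fragment below. This is a sketch under the assumption that the value is stored as a single-item list under the usual key path; the rest of the file is omitted.

```yaml
kubernetesMasterConfig:
  apiServerArguments:
    runtime-config:
    # The pod presets entry that the upgrade step removes.
    - apis/settings.k8s.io/v1alpha1=true
```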

Steps to Reproduce:
1. Upgrading from Red Hat OpenShift Container Platform 3.9 to 3.10 as per https://docs.openshift.com/container-platform/3.10/upgrading/automated_upgrades.html#upgrading-control-plane-nodes-separate-phases

Actual results:

The upgrade fails because the Master-API does not restart: master-config.yaml can no longer be parsed.

F0928 10:01:42.736248       1 start_api.go:68] could not load config file "/etc/origin/master/master-config.yaml" due to an error: error reading config: [pos 4311]: only encoded map or array can be decoded into a slice (6)
[root@master01 ~]# master-logs controllers controllers
F0928 10:01:42.737000       1 start_controllers.go:67] could not load config file "/etc/origin/master/master-config.yaml" due to an error: error reading config: [pos 4311]: only encoded map or array can be decoded into a slice (6)

Expected results:

No error is reported and the upgrade completes without issue.

Additional info:

https://bugzilla.redhat.com/show_bug.cgi?id=1633383 could be of interest here, as it also talks about this part of the configuration.

Comment 4 Scott Dodson 2018-10-16 20:25:26 UTC
https://github.com/openshift/openshift-ansible/pull/10423 proposed fix

Comment 5 Scott Dodson 2018-10-18 12:37:38 UTC
*** Bug 1633383 has been marked as a duplicate of this bug. ***

Comment 6 Scott Dodson 2018-10-23 18:59:56 UTC
https://github.com/openshift/openshift-ansible/pull/10495 release-3.11 backport

Comment 7 ge liu 2018-11-01 10:07:09 UTC
Upgraded from 3.10 to 3.11 successfully.
In the 3.10 env, master-config.yaml had no runtime-config item, and after the upgrade to 3.11 (v3.11.36) there is still no runtime-config item.

version: atomic-openshift-3.11.36-1.git.0.9c078f1.el7.x86_64

Comment 9 ge liu 2018-11-02 10:02:54 UTC
There is still a problem in upgrade.yml:
########################################################################
- name: Find current value for runtime-config
  yedit: 
    src: "/tmp/master-config.yaml"    ====>It should be '{{ openshift.common.config_base }}/master/'
    key: "kubernetesMasterConfig.apiServerArguments.runtime-config"
    state: list
  register: runtime_config
- name: Set the runtime-config to exclude pod presets
####################################################################
"/tmp/master-config.yaml" is not the right path, so the task will not list the runtime-config value, and the following steps will be skipped by the playbook. This error needs to be fixed.
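
A sketch of what the corrected task might look like, based only on the suggestion above. Appending the file name `master-config.yaml` to the suggested directory is an assumption, and the change actually merged upstream may differ.

```yaml
- name: Find current value for runtime-config
  yedit:
    # Assumption: read the live master config instead of the /tmp copy.
    src: "{{ openshift.common.config_base }}/master/master-config.yaml"
    key: "kubernetesMasterConfig.apiServerArguments.runtime-config"
    state: list
  register: runtime_config
```

With the real file as `src`, the registered result should contain the current value, so the follow-up task that excludes pod presets is no longer skipped.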

Comment 10 Scott Dodson 2018-11-02 12:24:29 UTC
Nice catch, thanks.

https://github.com/openshift/openshift-ansible/pull/10583

Comment 11 Scott Dodson 2018-11-07 13:59:35 UTC
In openshift-ansible-3.11.39-1 and later

Comment 12 ge liu 2018-11-09 05:03:44 UTC
Verified.

openshift-ansible-3.11.41-1.git.0.f711b2d.el7.noarch

After upgrading 3.9-->3.10-->3.11, the runtime-config item exists in master-config.yaml with the value []. This works as designed according to the ansible playbook, but I have a question: a fresh 3.10 install upgraded to 3.11 has no runtime-config item in master-config.yaml, whereas the upgraded env has it. Does this difference have any effect on OCP? Thanks.

Comment 13 Scott Dodson 2018-11-09 16:19:33 UTC
The main thing we're interested in fixing is that no matter the starting state the api server is able to parse the configuration file and run successfully. I think what you're describing indicates there's no problem, right?

Comment 14 ge liu 2018-11-12 02:33:50 UTC
The upgrade works well. I hope there is no potential risk from the difference in the runtime-config item between freshly installed and upgraded envs.

Comment 16 errata-xmlrpc 2018-11-20 03:10:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3537