Bug 1372594 - The SchedulingDisabled nodes are schedulable after upgrade
Summary: The SchedulingDisabled nodes are schedulable after upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Scott Dodson
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks: 1375718
Reported: 2016-09-02 07:48 UTC by Anping Li
Modified: 2017-03-08 18:26 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, an upgrade reset each node's schedulability to the state defined in the inventory used for the upgrade. If the scheduling state had been modified after the inventory file was created, this could surprise admins. The upgrade process now preserves the current schedulability state, so nodes do not change state after an upgrade.
Clone Of:
Clones: 1375718
Environment:
Last Closed: 2016-09-27 09:47:19 UTC
Target Upstream Version:


Links
System: Red Hat Product Errata
ID: RHBA-2016:1933
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Container Platform 3.3 Release Advisory
Last Updated: 2016-09-27 13:24:36 UTC

Description Anping Li 2016-09-02 07:48:13 UTC
Description of problem:
Nodes that were SchedulingDisabled become schedulable after an upgrade. By default, nodes that run a master are SchedulingDisabled after installation.

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.3.20-1.git.0.d15a8dc.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Install a two-node OSE 3.2 cluster and check the node status.
[root@anli-working host6]# oc get nodes
NAME                      STATUS                     AGE
host6master.example.com   Ready,SchedulingDisabled   41d
host6node.example.com     Ready                      41d

2. Upgrade to OCP 3.3 and check the node status.
[root@anli-working host6]# oc get nodes
NAME                      STATUS                     AGE
host6master.example.com   Ready                      41d
host6node.example.com     Ready                      41d

Actual results:


Expected results:
The node scheduling status is the same as before the upgrade.
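
A minimal sketch of how this expectation could be checked, assuming the `oc get nodes` text output from before and after the upgrade has been captured as strings. The helper names `parse_schedulable` and `changed_nodes` are hypothetical and not part of any OpenShift tooling; the sample data is copied from the steps above.

```python
def parse_schedulable(oc_output):
    """Map node name -> True if schedulable, from `oc get nodes` text output."""
    nodes = {}
    for line in oc_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        name, status = fields[0], fields[1]
        # STATUS is a comma-separated list such as "Ready,SchedulingDisabled".
        nodes[name] = "SchedulingDisabled" not in status.split(",")
    return nodes


def changed_nodes(before, after):
    """Return the names of nodes whose schedulability differs between snapshots."""
    return sorted(n for n in before if n in after and before[n] != after[n])


# Sample output taken from the reproduction steps in this report.
BEFORE = """\
NAME                      STATUS                     AGE
host6master.example.com   Ready,SchedulingDisabled   41d
host6node.example.com     Ready                      41d
"""
AFTER = """\
NAME                      STATUS                     AGE
host6master.example.com   Ready                      41d
host6node.example.com     Ready                      41d
"""

if __name__ == "__main__":
    print(changed_nodes(parse_schedulable(BEFORE), parse_schedulable(AFTER)))
```

With the output captured in this report, the check flags host6master.example.com; an empty list would mean the scheduling state was preserved.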

Additional info:

Comment 1 Scott Dodson 2016-09-02 20:04:58 UTC
Hi Anping,

We weren't able to reproduce the scenario you've described, where the master was set to schedulable when it wasn't prior to the upgrade. Can you share your inventory? The only way I can think of that you'd get these results is if the master had openshift_schedulable=true in the inventory but was then manually set unschedulable before the upgrade.


Regardless, I think the previous behavior could have left an environment in a "correct" but unexpected state, where everything is reset to the values in the inventory rather than to how things were prior to upgrading. Because of this, we've implemented a change that records each node's schedulability prior to the upgrade process and restores the node to that state after the upgrade, ignoring what was in the inventory. I think this is ultimately the right thing to do even if it's not strictly doing what's defined in the inventory.

https://github.com/openshift/openshift-ansible/pull/2406

What do you think?
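
The record-and-restore idea described in this comment can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code from the pull request: the dictionaries, the function names, and the `upgrade_from_inventory` stand-in for the old behavior are all hypothetical, and the inventory values reproduce the scenario guessed at above, in which the master has openshift_schedulable=true.

```python
# Hypothetical in-memory model; not the openshift-ansible implementation.

def record_schedulability(cluster):
    """Snapshot each node's schedulable flag before the upgrade starts."""
    return {name: node["schedulable"] for name, node in cluster.items()}


def upgrade_from_inventory(cluster, inventory):
    """Stand-in for the old behavior: reset every node to the inventory value."""
    for name, schedulable in inventory.items():
        cluster[name]["schedulable"] = schedulable


def restore_schedulability(cluster, snapshot):
    """Put each node back in its pre-upgrade state, ignoring the inventory."""
    for name, schedulable in snapshot.items():
        if name in cluster:
            cluster[name]["schedulable"] = schedulable


def demo():
    # Master was manually set unschedulable; the inventory says schedulable.
    cluster = {
        "host6master.example.com": {"schedulable": False},
        "host6node.example.com": {"schedulable": True},
    }
    inventory = {
        "host6master.example.com": True,
        "host6node.example.com": True,
    }
    snapshot = record_schedulability(cluster)   # taken before the upgrade
    upgrade_from_inventory(cluster, inventory)  # upgrade resets the state
    restore_schedulability(cluster, snapshot)   # fix: restore the snapshot
    return cluster


if __name__ == "__main__":
    print(demo())
```

Taking the snapshot before any upgrade step runs is the key design choice in this sketch: the restore step then never has to consult the inventory at all.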

Comment 3 Anping Li 2016-09-05 09:20:32 UTC
Scott, the fix works well. Yes, for an upgrade it is better to follow the rule: only modify what must be modified.

Comment 5 errata-xmlrpc 2016-09-27 09:47:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933

