Bug 1668317 - Packages in dedicated ETCD nodes are not upgraded
Summary: Packages in dedicated ETCD nodes are not upgraded
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.10.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 3.11.z
Assignee: Russell Teague
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks: 1693524
 
Reported: 2019-01-22 12:42 UTC by Arnab Ghosh
Modified: 2019-07-30 15:22 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: etcd hosts that were not masters were excluded from upgrades.
Consequence: etcd hosts that were also nodes were not upgraded.
Fix: Exclude etcd hosts only if they are neither masters nor nodes.
Result: etcd hosts are upgraded if they are also nodes.
The term 'dedicated' etcd host was intended to mean a host that runs only etcd and is not also a node.
Clone Of:
Clones: 1693524
Environment:
Last Closed: 2019-04-11 05:38:26 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0636 None None None 2019-04-11 05:38:35 UTC

Description Arnab Ghosh 2019-01-22 12:42:44 UTC
Description of problem:

The customer upgraded their cluster from OCP 3.9 to 3.10. After the upgrade, the Kubernetes version on the dedicated ETCD nodes differed from the version on all other nodes.

# oc get nodes
NAME                             STATUS    ROLES     AGE       VERSION
ocpetcdkk001.srv.muenchen.de     Ready     compute   2d        v1.9.1+a0ce1bc657
ocpetcdkk002.srv.muenchen.de     Ready     compute   2d        v1.9.1+a0ce1bc657
ocpetcdkk003.srv.muenchen.de     Ready     compute   2d        v1.9.1+a0ce1bc657
ocpmasterkk001.srv.muenchen.de   Ready     master    2d        v1.10.0+b81c8f8
ocpmasterkk002.srv.muenchen.de   Ready     master    2d        v1.10.0+b81c8f8
ocpmasterkk003.srv.muenchen.de   Ready     master    2d        v1.10.0+b81c8f8
ocpnodekk001.srv.muenchen.de     Ready     compute   2d        v1.10.0+b81c8f8
ocpnodekk002.srv.muenchen.de     Ready     compute   2d        v1.10.0+b81c8f8
ocpnodekk003.srv.muenchen.de     Ready     compute   2d        v1.10.0+b81c8f8

After reviewing the Ansible playbook for the OpenShift upgrade, we noticed the following play.

- name: Evaluate oo_nodes_to_upgrade
  add_host:
    name: "{{ item }}"
    groups: oo_nodes_to_upgrade
    ansible_ssh_user: "{{ g_ssh_user | default(omit) }}"
    ansible_become: "{{ g_sudo | default(omit) }}"
  when: item not in dedicated_etcds
  vars:
    dedicated_etcds: "{{ groups['oo_etcd_to_config'] | difference(groups['oo_masters']) }}"
  with_items: "{{ groups['temp_nodes_to_upgrade'] | default(groups['oo_nodes_to_config']) }}"
  changed_when: False

From the play above, we understood that if ETCD runs on a dedicated host, the packages on that host are not upgraded. The issue was resolved by modifying the play as follows.

- name: Evaluate oo_nodes_to_upgrade
  add_host:
    name: "{{ item }}"
    groups: oo_nodes_to_upgrade
    ansible_ssh_user: "{{ g_ssh_user | default(omit) }}"
    ansible_become: "{{ g_sudo | default(omit) }}"
  when: item in dedicated_etcds
  vars:
    dedicated_etcds: "{{ groups['oo_etcd_to_config'] | difference(groups['oo_masters']) }}"
  with_items: "{{ groups['temp_nodes_to_upgrade'] | default(groups['oo_nodes_to_config']) }}"
  changed_when: False
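
The effect of the `when` condition can be checked outside Ansible. The sketch below is only a model, not the playbook code or the merged fix: it uses Python sets to stand in for the inventory groups and for the Jinja2 `difference` filter. The short host names and the group membership are assumptions based on the `oc get nodes` output above (in particular, that the ETCD hosts are also listed as nodes), and the function names are hypothetical. The third function follows the wording of the Doc Text field (exclude an etcd host only if it is neither a master nor a node).

```python
# Assumed inventory groups, modeled on the `oc get nodes` output above.
etcd = {"ocpetcdkk001", "ocpetcdkk002", "ocpetcdkk003"}
masters = {"ocpmasterkk001", "ocpmasterkk002", "ocpmasterkk003"}
compute = {"ocpnodekk001", "ocpnodekk002", "ocpnodekk003"}
# Assumption: every host shown by `oc get nodes` is in oo_nodes_to_config.
nodes = etcd | masters | compute


def original_selection(etcd, masters, nodes):
    """Model of `when: item not in dedicated_etcds`."""
    dedicated = etcd - masters  # etcd hosts that are not masters
    return {host for host in nodes if host not in dedicated}


def reporter_selection(etcd, masters, nodes):
    """Model of the reporter's change, `when: item in dedicated_etcds`."""
    dedicated = etcd - masters
    return {host for host in nodes if host in dedicated}


def doc_text_selection(etcd, masters, nodes):
    """Model of the Doc Text wording: a host counts as dedicated only if
    it is an etcd host that is neither a master nor a node."""
    dedicated = etcd - masters - nodes
    return {host for host in nodes if host not in dedicated}


if __name__ == "__main__":
    for fn in (original_selection, reporter_selection, doc_text_selection):
        print(fn.__name__, sorted(fn(etcd, masters, nodes)))
```

Under these assumptions, the original condition skips the three ETCD hosts, which matches the version mismatch reported above. The reporter's inverted condition selects only the ETCD hosts, so it appears suitable for a one-off catch-up run rather than for a full cluster upgrade. The Doc Text variant selects every node.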

Version-Release number of the following components:
rpm -q openshift-ansible
3.10.73

How reproducible:
Always

Steps to Reproduce:
1. Keep dedicated ETCD host in your inventory
2. Upgrade cluster from OCP 3.9 to OCP 3.10

Actual results:
No Error

Expected results:
All nodes should have same kubelet and kube-proxy version

Let me know if you need the upgrade log; I can't upload it here because it is large.

Comment 1 Scott Dodson 2019-01-22 14:39:22 UTC
Please provide complete inventory, verbose logs, versions of ansible and openshift-ansible in use.

I'm mainly interested in which groups the etcd hosts are members of but the other information will be useful too and should be provided for all bugs.

Comment 6 Russell Teague 2019-03-11 14:11:37 UTC
Proposed: https://github.com/openshift/openshift-ansible/pull/11335

Comment 9 ge liu 2019-04-01 02:55:47 UTC
Verified with 3.11.98. I assume this solution is OK based on comment 8, since there are no further comments. Thanks.

Comment 11 errata-xmlrpc 2019-04-11 05:38:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0636

Comment 13 Russell Teague 2019-07-30 15:22:22 UTC
I'm not sure what you mean by "Etcd dedicated servers are using openshift 3.9 version".  Generally, if etcd is on dedicated servers, there are no openshift components installed.  If etcd is running on openshift nodes, they should be upgraded with other nodes.

Please do not comment on closed bugs for new issues.  To track this issue, open a new bug and attach logs and inventory.  This bug can be referenced for context.

