Bug 1609191 - Need to backup node-config file during upgrade which is a must for downgrade
Summary: Need to backup node-config file during upgrade which is a must for downgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.z
Assignee: Patrick Dillon
QA Contact: liujia
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-07-27 09:08 UTC by liujia
Modified: 2019-06-11 09:31 UTC (History)

Fixed In Version: openshift-ansible-3.10.132-1.git.0.a647280.el7
Doc Type: Bug Fix
Doc Text:
Cause: node-config.yaml was not backed up and was overwritten when upgrading from 3.9 to 3.10. Consequence: Downgrade from 3.10 to 3.9 was impossible. Fix: Back up node-config.yaml and the atomic-openshift systemd files when performing the upgrade. Result: Downgrade from 3.10 to 3.9 is possible.
Clone Of:
Environment:
Last Closed: 2019-06-11 09:30:48 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2019:0786 | 0 | None | None | None | 2019-06-11 09:30:59 UTC

Description liujia 2018-07-27 09:08:12 UTC
Description of problem:
After upgrading from v3.9 to v3.10, node-config.yaml has been updated, but no backup file was created before the change. A backup should be required because the original config file is needed for downgrade.

[root@upgrade-slave-1-liujia 9663]# ansible -i hosts nodes -m shell -a "ls -la /etc/origin/node/ |grep node-config"
jliu-dr-master-etcd-3.0726-la0.qe.rhcloud.com | SUCCESS | rc=0 >>
-rw-r--r--. 1 root root 1683 Jul 27 02:39 bootstrap-node-config.yaml
-rw-------. 1 root root 1820 Jul 27 03:54 node-config.yaml

[root@jliu-dr-master-etcd-1 ~]# diff /etc/origin/node/node-config.yaml /root/etc_bak/origin/node/node-config.yaml 
0a1
> allowDisabledDocker: false
2,6d2
< authConfig:
<   authenticationCacheSize: 1000
<   authenticationCacheTTL: 5m
<   authorizationCacheSize: 1000
<   authorizationCacheTTL: 5m
8,10d3
< dnsDomain: cluster.local
< dnsIP: 0.0.0.0
< dnsNameservers: null
11a5,6
> dnsDomain: cluster.local
> dnsIP: 10.240.0.11
13,16c8,9
<   dockerShimRootDirectory: /var/lib/dockershim
<   dockerShimSocket: /var/run/dockershim.sock
<   execHandlerName: native
< enableUnidling: true
---
>   execHandlerName: ""
> iptablesSyncPeriod: "30s"
19,20c12
<   latest: false
< iptablesSyncPeriod: 30s
---
>   latest: False
22,26c14
< kubeletArguments:
<   bootstrap-kubeconfig:
<   - /etc/origin/node/bootstrap.kubeconfig
<   cert-dir:
<   - /etc/origin/node/certificates
---
> kubeletArguments: 
33,34d20
<   feature-gates:
<   - RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true
46,48c32
<   - node-role.kubernetes.io/master=true
<   pod-manifest-path:
<   - /etc/origin/node/pods
---
>   - role=node
51,52d34
<   rotate-certificates:
<   - 'true'
55d36
<   burst: 40
57,58c38,43
<   qps: 20
< masterKubeConfig: node.kubeconfig
---
>   burst: 200
>   qps: 100
> masterKubeConfig: system:node:jliu-dr-master-etcd-1.kubeconfig
> networkPluginName: redhat/openshift-ovs-subnet
> # networkConfig struct introduced in origin 1.0.6 and OSE 3.0.2 which
> # deprecates networkPluginName above. The two should match.
60,61c45,48
<   mtu: 1410
<   networkPluginName: redhat/openshift-ovs-subnet
---
>    mtu: 1410
>    networkPluginName: redhat/openshift-ovs-subnet
> nodeName: jliu-dr-master-etcd-1
> podManifestConfig:
64,67c51,57
<   bindNetwork: tcp4
<   certFile: ''
<   clientCA: client-ca.crt
<   keyFile: ''
---
>   certFile: server.crt
>   clientCA: ca.crt
>   keyFile: server.key
> volumeDirectory: /var/lib/origin/openshift.local.volumes
> proxyArguments:
>   proxy-mode:
>      - iptables
70,71c60
<     perFSGroup: null
< volumeDirectory: /var/lib/origin/openshift.local.volumes
---
>     perFSGroup: 


Version-Release number of the following components:
openshift-ansible-3.10.21-1.git.0.6446011.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Run upgrade from v3.9 to v3.10

Actual results:
node-config.yaml was not backed up

Expected results:
node-config.yaml should be backed up

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 Scott Dodson 2018-07-27 12:29:42 UTC
Yeah, I think we should also back up /etc/systemd/system/atomic-openshift-*.service.

I think those are the only gaps.
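
For illustration only, a minimal shell sketch of the kind of backup discussed here. It is not the openshift-ansible implementation. The function name backup_file is hypothetical, and the naming scheme (last dot replaced by a hyphen, then .bak-<timestamp>) is an assumption inferred from the backup file names that appear in the verification output on this bug. It also assumes the file name has an extension.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not the actual openshift-ansible code).
# Copies a file to <name>-<ext>.bak-<timestamp> in the same directory and
# prints the path of the backup.
backup_file() {
    src=$1
    # Nothing to do if the source is not a regular file.
    if [ ! -f "$src" ]; then
        return 0
    fi
    dir=$(dirname "$src")
    base=$(basename "$src")
    stamp=$(date +%Y%m%dT%H%M%S)
    # Example: node-config.yaml -> node-config-yaml.bak-<timestamp>
    dest="$dir/${base%.*}-${base##*.}.bak-$stamp"
    # -p keeps the original mode and mtime on the copy.
    cp -p "$src" "$dest"
    printf '%s\n' "$dest"
}
```

On a node this could be called as backup_file /etc/origin/node/node-config.yaml, and in a loop over /etc/systemd/system/atomic-openshift-*.service for the unit files.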

Comment 2 Michael Gugino 2018-11-29 21:31:04 UTC
We should fix this.  We need to ensure that subsequent 3.10 minor upgrades don't overwrite the 'actual' node-config from 3.9.
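
One possible way to meet that requirement, sketched in shell under stated assumptions: take a backup only when no earlier backup of the same file exists, so the copy made before the first 3.10 upgrade is never replaced by a later run. The function name backup_once and the .bak-<timestamp> naming are hypothetical, and this is not necessarily how the eventual fix behaves.

```shell
#!/bin/sh
# Hypothetical guard (assumption, not the actual openshift-ansible code).
# Creates a timestamped backup only if no earlier backup of the file exists.
backup_once() {
    src=$1
    dir=$(dirname "$src")
    base=$(basename "$src")
    # Example prefix: /etc/origin/node/node-config-yaml.bak-
    prefix="$dir/${base%.*}-${base##*.}.bak-"
    # If any earlier backup is present, keep it and do nothing.
    for existing in "$prefix"*; do
        if [ -e "$existing" ]; then
            return 0
        fi
    done
    # Nothing to back up if the source is not a regular file.
    if [ ! -f "$src" ]; then
        return 0
    fi
    cp -p "$src" "$prefix$(date +%Y%m%dT%H%M%S)"
}
```

The trade-off is that a stale backup left over from an unrelated earlier run would also block a new one, so a real implementation would likely need a way to tell which backup belongs to the 3.9 configuration.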

Comment 3 Patrick Dillon 2019-03-25 16:47:08 UTC
PR: https://github.com/openshift/openshift-ansible/pull/11396

Comment 8 liujia 2019-04-18 09:58:37 UTC
Version:
openshift-ansible-3.10.139-1.git.0.02bc5db.el7.noarch

Before upgrade:
[root@preserve-jliu-worker 59733]# ansible -i hosts nodes -m shell -a "ls -la /etc/origin/node/ |grep node-config"
ec2-34-238-138-13.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-------. 1 root root 1619 Apr 18 05:01 node-config.yaml

ec2-35-168-12-128.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-------. 1 root root 1579 Apr 18 05:01 node-config.yaml

[root@preserve-jliu-worker 59733]# ansible -i hosts nodes -m shell -a "ls -la /etc/systemd/system/ |grep atomic"
ec2-34-238-138-13.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-r--r--.  1 root root 1170 Apr 18 05:01 atomic-openshift-node.service

ec2-35-168-12-128.compute-1.amazonaws.com | CHANGED | rc=0 >>
lrwxrwxrwx.  1 root root    9 Apr 18 04:57 atomic-openshift-master.service -> /dev/null
-rw-r--r--.  1 root root 1170 Apr 18 05:01 atomic-openshift-node.service
drwxr-xr-x.  2 root root   49 Apr 18 04:57 atomic-openshift-node.service.wants

After upgrade:
[root@preserve-jliu-worker 59733]# ansible -i hosts nodes -m shell -a "ls -la /etc/origin/node/ |grep node-config"
ec2-34-238-138-13.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-r--r--. 1 root root 1731 Apr 18 05:52 bootstrap-node-config.yaml
-rw-------. 1 root root 1876 Apr 18 05:52 node-config.yaml
-rw-r--r--. 1 root root 1619 Apr 18 05:01 node-config-yaml.bak-20190418T052317

ec2-35-168-12-128.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-r--r--. 1 root root 1691 Apr 18 05:29 bootstrap-node-config.yaml
-rw-------. 1 root root 1870 Apr 18 05:30 node-config.yaml
-rw-r--r--. 1 root root 1579 Apr 18 05:01 node-config-yaml.bak-20190418T052316

[root@preserve-jliu-worker 59733]# ansible -i hosts nodes -m shell -a "ls -la /etc/systemd/system/ |grep atomic"
ec2-34-238-138-13.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-r--r--.  1 root root  544 Apr 18 05:52 atomic-openshift-node.service
-rw-r--r--.  1 root root 1170 Apr 18 05:01 atomic-openshift-node-service.bak-20190418T052317

ec2-35-168-12-128.compute-1.amazonaws.com | CHANGED | rc=0 >>
lrwxrwxrwx.  1 root root    9 Apr 18 04:57 atomic-openshift-master.service -> /dev/null
-rw-r--r--.  1 root root    0 Apr 18 04:50 atomic-openshift-master-service.bak-20190418T052316
-rw-r--r--.  1 root root  544 Apr 18 05:29 atomic-openshift-node.service
-rw-r--r--.  1 root root  544 Apr 18 05:29 atomic-openshift-node-service.bak-20190418T052316
drwxr-xr-x.  2 root root   49 Apr 18 04:57 atomic-openshift-node.service.wants


# diff node-config.yaml node-config-yaml.bak-20190418T052317 
0a1
> allowDisabledDocker: false
2,6d2
< authConfig:
<   authenticationCacheSize: 1000
<   authenticationCacheTTL: 5m
<   authorizationCacheSize: 1000
<   authorizationCacheTTL: 5m
8,10d3
< dnsDomain: cluster.local
< dnsIP: 0.0.0.0
< dnsNameservers: null
11a5,6
> dnsDomain: cluster.local
> dnsIP: 172.18.0.148
13,16c8,9
<   dockerShimRootDirectory: /var/lib/dockershim
<   dockerShimSocket: /var/run/dockershim.sock
<   execHandlerName: native
< enableUnidling: true
---
>   execHandlerName: ""
> iptablesSyncPeriod: "30s"
19,20c12
<   latest: false
< iptablesSyncPeriod: 30s
---
>   latest: False
22,26c14
< kubeletArguments:
<   bootstrap-kubeconfig:
<   - /etc/origin/node/bootstrap.kubeconfig
<   cert-dir:
<   - /etc/origin/node/certificates
---
> kubeletArguments: 
33,34d20
<   feature-gates:
<   - RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true
46,48c32,34
<   - role=node,registry=enabled,router=enabled
<   pod-manifest-path:
<   - /etc/origin/node/pods
---
>   - router=enabled
>   - role=node
>   - registry=enabled
51,52d36
<   rotate-certificates:
<   - 'true'
55d38
<   burst: 40
57,58c40,45
<   qps: 20
< masterKubeConfig: node.kubeconfig
---
>   burst: 200
>   qps: 100
> masterKubeConfig: system:node:ip-172-18-0-148.ec2.internal.kubeconfig
> networkPluginName: redhat/openshift-ovs-subnet
> # networkConfig struct introduced in origin 1.0.6 and OSE 3.0.2 which
> # deprecates networkPluginName above. The two should match.
60,64c47,50
<   mtu: 8951
<   networkPluginName: redhat/openshift-ovs-subnet
< proxyArguments:
<   cluster-cidr:
<   - 10.128.0.0/14
---
>    mtu: 8951
>    networkPluginName: redhat/openshift-ovs-subnet
> nodeName: ip-172-18-0-148.ec2.internal
> podManifestConfig:
67,70c53,59
<   bindNetwork: tcp4
<   certFile: ''
<   clientCA: client-ca.crt
<   keyFile: ''
---
>   certFile: server.crt
>   clientCA: ca.crt
>   keyFile: server.key
> volumeDirectory: /var/lib/origin/openshift.local.volumes
> proxyArguments:
>   proxy-mode:
>      - iptables
73,74c62
<     perFSGroup: null
< volumeDirectory: /var/lib/origin/openshift.local.volumes
---
>     perFSGroup:

Comment 10 errata-xmlrpc 2019-06-11 09:30:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0786

