Description of problem:
After upgrading from v3.9 to v3.10, node-config.yaml has been updated, but no backup file was created before the change. A backup should be required because the config file is needed for a downgrade.

[root@upgrade-slave-1-liujia 9663]# ansible -i hosts nodes -m shell -a "ls -la /etc/origin/node/ |grep node-config"
jliu-dr-master-etcd-3.0726-la0.qe.rhcloud.com | SUCCESS | rc=0 >>
-rw-r--r--. 1 root root 1683 Jul 27 02:39 bootstrap-node-config.yaml
-rw-------. 1 root root 1820 Jul 27 03:54 node-config.yaml

[root@jliu-dr-master-etcd-1 ~]# diff /etc/origin/node/node-config.yaml /root/etc_bak/origin/node/node-config.yaml
0a1
> allowDisabledDocker: false
2,6d2
< authConfig:
< authenticationCacheSize: 1000
< authenticationCacheTTL: 5m
< authorizationCacheSize: 1000
< authorizationCacheTTL: 5m
8,10d3
< dnsDomain: cluster.local
< dnsIP: 0.0.0.0
< dnsNameservers: null
11a5,6
> dnsDomain: cluster.local
> dnsIP: 10.240.0.11
13,16c8,9
< dockerShimRootDirectory: /var/lib/dockershim
< dockerShimSocket: /var/run/dockershim.sock
< execHandlerName: native
< enableUnidling: true
---
> execHandlerName: ""
> iptablesSyncPeriod: "30s"
19,20c12
< latest: false
< iptablesSyncPeriod: 30s
---
> latest: False
22,26c14
< kubeletArguments:
< bootstrap-kubeconfig:
< - /etc/origin/node/bootstrap.kubeconfig
< cert-dir:
< - /etc/origin/node/certificates
---
> kubeletArguments:
33,34d20
< feature-gates:
< - RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true
46,48c32
< - node-role.kubernetes.io/master=true
< pod-manifest-path:
< - /etc/origin/node/pods
---
> - role=node
51,52d34
< rotate-certificates:
< - 'true'
55d36
< burst: 40
57,58c38,43
< qps: 20
< masterKubeConfig: node.kubeconfig
---
> burst: 200
> qps: 100
> masterKubeConfig: system:node:jliu-dr-master-etcd-1.kubeconfig
> networkPluginName: redhat/openshift-ovs-subnet
> # networkConfig struct introduced in origin 1.0.6 and OSE 3.0.2 which
> # deprecates networkPluginName above. The two should match.
60,61c45,48
< mtu: 1410
< networkPluginName: redhat/openshift-ovs-subnet
---
> mtu: 1410
> networkPluginName: redhat/openshift-ovs-subnet
> nodeName: jliu-dr-master-etcd-1
> podManifestConfig:
64,67c51,57
< bindNetwork: tcp4
< certFile: ''
< clientCA: client-ca.crt
< keyFile: ''
---
> certFile: server.crt
> clientCA: ca.crt
> keyFile: server.key
> volumeDirectory: /var/lib/origin/openshift.local.volumes
> proxyArguments:
> proxy-mode:
> - iptables
70,71c60
< perFSGroup: null
< volumeDirectory: /var/lib/origin/openshift.local.volumes
---
> perFSGroup:

Version-Release number of the following components:
openshift-ansible-3.10.21-1.git.0.6446011.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Run upgrade from v3.9 to v3.10

Actual results:
node-config.yaml was not backed up.

Expected results:
node-config.yaml should be backed up.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Yeah, I think we should also back up /etc/systemd/system/atomic-openshift-*.service. I think those are the only gaps.
We should fix this. We need to ensure that subsequent 3.10 minor upgrades don't overwrite the 'actual' node-config from 3.9.
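One way to keep a later 3.10 minor upgrade from replacing the copy taken from 3.9 is to back up a file only when no earlier backup of it exists. The following is a minimal sketch of that idea, not the change made in the PR: the helper name `backup_once`, the `.bak-<timestamp>` suffix, and the scratch directory used for the demo are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from openshift-ansible): copy a file to
# "<file>.bak-<timestamp>" next to it, but only if no earlier backup of
# that file exists, so the first (pre-upgrade) copy is never superseded.
backup_once() {
    local src="$1"
    # Skip anything that is not a regular file, e.g. a masked unit that
    # is a symlink to /dev/null.
    [ -f "$src" ] && [ ! -L "$src" ] || return 0
    # Skip if a backup from an earlier run is already present.
    if compgen -G "${src}.bak-*" > /dev/null; then
        return 0
    fi
    cp -p -- "$src" "${src}.bak-$(date +%Y%m%dT%H%M%S)"
}

# Demo on a scratch directory rather than the real /etc/origin/node.
demo_dir="$(mktemp -d)"
echo "version: 3.9" > "${demo_dir}/node-config.yaml"
backup_once "${demo_dir}/node-config.yaml"   # first run: creates the backup
echo "version: 3.10" > "${demo_dir}/node-config.yaml"
backup_once "${demo_dir}/node-config.yaml"   # later run: no-op, backup kept
ls "${demo_dir}"
```

On a real node the same helper could be pointed at /etc/origin/node/node-config.yaml and the /etc/systemd/system/atomic-openshift-*.service units; whether skipping later backups is the right policy depends on how downgrades are expected to be performed.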
PR: https://github.com/openshift/openshift-ansible/pull/11396
Version: openshift-ansible-3.10.139-1.git.0.02bc5db.el7.noarch

Before upgrade:
[root@preserve-jliu-worker 59733]# ansible -i hosts nodes -m shell -a "ls -la /etc/origin/node/ |grep node-config"
ec2-34-238-138-13.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-------. 1 root root 1619 Apr 18 05:01 node-config.yaml

ec2-35-168-12-128.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-------. 1 root root 1579 Apr 18 05:01 node-config.yaml

[root@preserve-jliu-worker 59733]# ansible -i hosts nodes -m shell -a "ls -la /etc/systemd/system/ |grep atomic"
ec2-34-238-138-13.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-r--r--. 1 root root 1170 Apr 18 05:01 atomic-openshift-node.service

ec2-35-168-12-128.compute-1.amazonaws.com | CHANGED | rc=0 >>
lrwxrwxrwx. 1 root root 9 Apr 18 04:57 atomic-openshift-master.service -> /dev/null
-rw-r--r--. 1 root root 1170 Apr 18 05:01 atomic-openshift-node.service
drwxr-xr-x. 2 root root 49 Apr 18 04:57 atomic-openshift-node.service.wants

After upgrade:
[root@preserve-jliu-worker 59733]# ansible -i hosts nodes -m shell -a "ls -la /etc/origin/node/ |grep node-config"
ec2-34-238-138-13.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-r--r--. 1 root root 1731 Apr 18 05:52 bootstrap-node-config.yaml
-rw-------. 1 root root 1876 Apr 18 05:52 node-config.yaml
-rw-r--r--. 1 root root 1619 Apr 18 05:01 node-config-yaml.bak-20190418T052317

ec2-35-168-12-128.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-r--r--. 1 root root 1691 Apr 18 05:29 bootstrap-node-config.yaml
-rw-------. 1 root root 1870 Apr 18 05:30 node-config.yaml
-rw-r--r--. 1 root root 1579 Apr 18 05:01 node-config-yaml.bak-20190418T052316

[root@preserve-jliu-worker 59733]# ansible -i hosts nodes -m shell -a "ls -la /etc/systemd/system/ |grep atomic"
ec2-34-238-138-13.compute-1.amazonaws.com | CHANGED | rc=0 >>
-rw-r--r--. 1 root root 544 Apr 18 05:52 atomic-openshift-node.service
-rw-r--r--. 1 root root 1170 Apr 18 05:01 atomic-openshift-node-service.bak-20190418T052317

ec2-35-168-12-128.compute-1.amazonaws.com | CHANGED | rc=0 >>
lrwxrwxrwx. 1 root root 9 Apr 18 04:57 atomic-openshift-master.service -> /dev/null
-rw-r--r--. 1 root root 0 Apr 18 04:50 atomic-openshift-master-service.bak-20190418T052316
-rw-r--r--. 1 root root 544 Apr 18 05:29 atomic-openshift-node.service
-rw-r--r--. 1 root root 544 Apr 18 05:29 atomic-openshift-node-service.bak-20190418T052316
drwxr-xr-x. 2 root root 49 Apr 18 04:57 atomic-openshift-node.service.wants

# diff node-config.yaml node-config-yaml.bak-20190418T052317
0a1
> allowDisabledDocker: false
2,6d2
< authConfig:
< authenticationCacheSize: 1000
< authenticationCacheTTL: 5m
< authorizationCacheSize: 1000
< authorizationCacheTTL: 5m
8,10d3
< dnsDomain: cluster.local
< dnsIP: 0.0.0.0
< dnsNameservers: null
11a5,6
> dnsDomain: cluster.local
> dnsIP: 172.18.0.148
13,16c8,9
< dockerShimRootDirectory: /var/lib/dockershim
< dockerShimSocket: /var/run/dockershim.sock
< execHandlerName: native
< enableUnidling: true
---
> execHandlerName: ""
> iptablesSyncPeriod: "30s"
19,20c12
< latest: false
< iptablesSyncPeriod: 30s
---
> latest: False
22,26c14
< kubeletArguments:
< bootstrap-kubeconfig:
< - /etc/origin/node/bootstrap.kubeconfig
< cert-dir:
< - /etc/origin/node/certificates
---
> kubeletArguments:
33,34d20
< feature-gates:
< - RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true
46,48c32,34
< - role=node,registry=enabled,router=enabled
< pod-manifest-path:
< - /etc/origin/node/pods
---
> - router=enabled
> - role=node
> - registry=enabled
51,52d36
< rotate-certificates:
< - 'true'
55d38
< burst: 40
57,58c40,45
< qps: 20
< masterKubeConfig: node.kubeconfig
---
> burst: 200
> qps: 100
> masterKubeConfig: system:node:ip-172-18-0-148.ec2.internal.kubeconfig
> networkPluginName: redhat/openshift-ovs-subnet
> # networkConfig struct introduced in origin 1.0.6 and OSE 3.0.2 which
> # deprecates networkPluginName above. The two should match.
60,64c47,50
< mtu: 8951
< networkPluginName: redhat/openshift-ovs-subnet
< proxyArguments:
< cluster-cidr:
< - 10.128.0.0/14
---
> mtu: 8951
> networkPluginName: redhat/openshift-ovs-subnet
> nodeName: ip-172-18-0-148.ec2.internal
> podManifestConfig:
67,70c53,59
< bindNetwork: tcp4
< certFile: ''
< clientCA: client-ca.crt
< keyFile: ''
---
> certFile: server.crt
> clientCA: ca.crt
> keyFile: server.key
> volumeDirectory: /var/lib/origin/openshift.local.volumes
> proxyArguments:
> proxy-mode:
> - iptables
73,74c62
< perFSGroup: null
< volumeDirectory: /var/lib/origin/openshift.local.volumes
---
> perFSGroup:
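The manual check in this comment could also be scripted. The sketch below assumes the backups follow the `<name>.bak-<timestamp>` naming visible in the listings; the function name `check_backup` and the scratch directory are hypothetical and only stand in for a real node's /etc/origin/node.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical check: succeed only if at least one file matches the given
# glob pattern, otherwise report the gap on stderr and return non-zero.
check_backup() {
    local pattern="$1"
    if compgen -G "$pattern" > /dev/null; then
        echo "OK: backup found for ${pattern}"
    else
        echo "MISSING: no backup matches ${pattern}" >&2
        return 1
    fi
}

# Demo on a scratch directory that imitates the post-upgrade listing.
node_dir="$(mktemp -d)"
touch "${node_dir}/node-config.yaml" \
      "${node_dir}/node-config-yaml.bak-20190418T052317"
check_backup "${node_dir}/node-config-yaml.bak-*"
```

Run across nodes with the same `ansible -m shell` invocation used above, a non-zero return code would flag any host where the upgrade left no backup behind.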
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0786