Description of problem:
Upgrading OCP 3.4 to 3.5 fails when a petset exists in the cluster, because petsets are not supported in 3.5 (Kubernetes 1.5).

fatal: [x.x.x.x -> x.x.x.x]: FAILED! => {
    "changed": true,
    "cmd": [
        "/usr/local/bin/oadm",
        "drain",
        "ip-172-18-14-148.ec2.internal",
        "--force",
        "--delete-local-data",
        "--ignore-daemonsets"
    ],
    "delta": "0:00:04.205678",
    "end": "2017-03-01 05:53:54.861837",
    "failed": true,
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/oadm drain ip-172-18-14-148.ec2.internal --force --delete-local-data --ignore-daemonsets",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "warn": true
        },
        "module_name": "command"
    },
    "rc": 1,
    "start": "2017-03-01 05:53:50.656159",
    "warnings": []
}

STDOUT:
node "ip-172-18-14-148.ec2.internal" already cordoned

STDERR:
error: Unknown controller kind "PetSet": hello-petset-1, hello-petset-1

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.5.17-1.git.0.561702e.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Install OCP 3.4.
2. Create a petset.
3. Upgrade OCP 3.4 to 3.5.
(See the command sketch below.)

Actual results:
The upgrade fails at TASK [Drain Node for Kubelet upgrade].

Expected results:
The upgrade completes successfully with a petset present.

Additional info:
https://kubernetes.io/docs/tasks/manage-stateful-set/upgrade-pet-set-to-stateful-set/
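A minimal shell version of steps 2 and 3, assuming the hello-petset example manifest referenced later in this report and the byo upgrade playbook path shown in the retry output below; the inventory path is environment-specific:

  # Create an example PetSet on the 3.4 cluster (manifest from this report).
  oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/b90a5a05c6af96b8e94085822e723ef7be57fe5b/petset/hello-petset.yaml

  # Run the 3.4 -> 3.5 upgrade playbook.
  ansible-playbook -i /path/to/inventory \
      /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_5/upgrade.yml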
Clayton, what should we be doing about pet sets during upgrades where we drain nodes?
Asking around on aos-devel to get broader input on this.
Consensus from the mailing list: migrating petsets to stateful sets is out of scope and will not be supported in the OpenShift-Ansible 3.4 -> 3.5 upgrade playbooks. These features were never officially supported, and no support statements about them were ever made. Furthermore, petsets were only ever an Alpha feature in the Kubernetes version that shipped them. Because StatefulSets are still a Beta resource in Kubernetes 1.5, they may likewise not be supported in version migrations when 3.6 is released (pending future official support communications, of course).

I am working on a patch now that runs during upgrade pre-validation and detects existing petsets. Users will be given a helpful message clarifying that the petset feature is unsupported, along with a reference to the official Kubernetes migration docs.
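To illustrate the intent, a rough shell sketch of the check (not the actual playbook task), assuming an oc session with cluster-admin rights and python available on the host:

  # Count PetSet objects across all namespaces; abort the upgrade if any exist.
  count=$(oc get petsets --all-namespaces -o json | \
          python -c 'import json, sys; print(len(json.load(sys.stdin)["items"]))')
  if [ "$count" -gt 0 ]; then
      echo "PetSets detected; the upgrade playbooks do not migrate them to StatefulSets." >&2
      echo "See https://kubernetes.io/docs/tasks/manage-stateful-set/upgrade-pet-set-to-stateful-set/" >&2
      exit 1
  fi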
Merged into master. Please re-test.
Created attachment 1261429 [details]
The upgrade logs when using a petset

Failed to skip the petset:
1. oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/b90a5a05c6af96b8e94085822e723ef7be57fe5b/petset/hello-petset.yaml
2. Run upgrade.yml

fatal: [openshift-225.lab.eng.nay.redhat.com -> openshift-223.lab.eng.nay.redhat.com]: FAILED! => {
    "changed": true,
    "cmd": [
        "/usr/local/bin/oadm",
        "drain",
        "openshift-225.lab.eng.nay.redhat.com",
        "--force",
        "--delete-local-data",
        "--ignore-daemonsets"
    ],
    "delta": "0:00:00.350344",
    "end": "2017-03-08 23:32:31.383682",
    "failed": true,
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/oadm drain openshift-225.lab.eng.nay.redhat.com --force --delete-local-data --ignore-daemonsets",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "warn": true
        },
        "module_name": "command"
    },
    "rc": 1,
    "start": "2017-03-08 23:32:31.033338",
    "warnings": []
}

STDOUT:
node "openshift-225.lab.eng.nay.redhat.com" already cordoned

STDERR:
error: Unknown controller kind "PetSet": hello-petset-0, hello-petset-0, hello-petset-1, hello-petset-1
I believe I've fixed the issue you ran into; I had an incorrect comparison test in the original PR. This works better: https://github.com/openshift/openshift-ansible/pull/3623

The other issue mentioned in there is unrelated to my changes.
Yes, I can see the fixed task was executed, but I am not sure why "skipped" is true for TASK [FAIL ON Resource migration 'PetSets' unsupported]:

TASK [Check if legacy PetSets exist] *******************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/v3_5/validator.yml:31
Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py
<openshift-223.lab.eng.nay.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<openshift-223.lab.eng.nay.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r openshift-223.lab.eng.nay.redhat.com '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo ~/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639 `" && echo ansible-tmp-1489033498.92-73057137625639="` echo ~/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639 `" ) && sleep 0'"'"''
<openshift-223.lab.eng.nay.redhat.com> PUT /tmp/tmphkwjKv TO /root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/oc_obj.py
<openshift-223.lab.eng.nay.redhat.com> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r '[openshift-223.lab.eng.nay.redhat.com]'
<openshift-223.lab.eng.nay.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<openshift-223.lab.eng.nay.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r openshift-223.lab.eng.nay.redhat.com '/bin/sh -c '"'"'chmod u+x /root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/ /root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/oc_obj.py && sleep 0'"'"''
<openshift-223.lab.eng.nay.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<openshift-223.lab.eng.nay.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r -tt openshift-223.lab.eng.nay.redhat.com '/bin/sh -c '"'"'/usr/bin/python /root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/oc_obj.py; rm -rf "/root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/" > /dev/null 2>&1 && sleep 0'"'"''
ok: [openshift-223.lab.eng.nay.redhat.com] => {
    "changed": false,
    "invocation": {
        "module_args": {
            "all_namespaces": true,
            "content": null,
            "debug": false,
            "delete_after": false,
            "files": null,
            "force": false,
            "kind": "petsets",
            "kubeconfig": "/etc/origin/master/admin.kubeconfig",
            "name": null,
            "namespace": "default",
            "selector": null,
            "state": "list"
        },
        "module_name": "oc_obj"
    },
    "results": {
        "cmd": "/usr/local/bin/oc get petsets -o json --all-namespaces",
        "results": [
            {
                "apiVersion": "v1",
                "items": [
                    {
                        "apiVersion": "apps/v1alpha1",
                        "kind": "PetSet",
                        "metadata": {
                            "creationTimestamp": "2017-03-09T03:29:45Z",
                            "generation": 1,
                            "labels": {"app": "hello-pod"},
                            "name": "hello-petset",
                            "namespace": "default",
                            "resourceVersion": "5629",
                            "selfLink": "/apis/apps/v1alpha1/namespaces/default/petsets/hello-petset",
                            "uid": "a42a35a6-0478-11e7-932f-fa163e30eba3"
                        },
                        "spec": {
                            "replicas": 2,
                            "selector": {"matchLabels": {"app": "hello-pod"}},
                            "serviceName": "foo",
                            "template": {
                                "metadata": {
                                    "annotations": {"pod.alpha.kubernetes.io/initialized": "true"},
                                    "creationTimestamp": null,
                                    "labels": {"app": "hello-pod"}
                                },
                                "spec": {
                                    "containers": [
                                        {
                                            "image": "openshift/hello-openshift:latest",
                                            "imagePullPolicy": "IfNotPresent",
                                            "name": "hello-pod",
                                            "ports": [{"containerPort": 8080, "protocol": "TCP"}],
                                            "resources": {},
                                            "securityContext": {"capabilities": {}, "privileged": false},
                                            "terminationMessagePath": "/dev/termination-log",
                                            "volumeMounts": [{"mountPath": "/tmp", "name": "tmp"}]
                                        }
                                    ],
                                    "dnsPolicy": "ClusterFirst",
                                    "restartPolicy": "Always",
                                    "securityContext": {},
                                    "terminationGracePeriodSeconds": 0,
                                    "volumes": [{"emptyDir": {}, "name": "tmp"}]
                                }
                            }
                        },
                        "status": {"replicas": 2}
                    }
                ],
                "kind": "List",
                "metadata": {}
            }
        ],
        "returncode": 0
    },
    "state": "list"
}

TASK [FAIL ON Resource migration 'PetSets' unsupported] ************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/v3_5/validator.yml:38
skipping: [openshift-223.lab.eng.nay.redhat.com] => {
    "changed": false,
    "skip_reason": "Conditional check failed",
    "skipped": true
}
Backported to release-1.5: https://github.com/openshift/openshift-ansible/pull/3638

The task name referenced in comment 9 no longer exists, so I think we should retest with a new build.
fixed in openshift-ansible-3.5.30-1
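Before retesting, it may be worth confirming that the host driving the upgrade actually has the fixed build installed; a quick check using the package names mentioned in this report:

  rpm -q openshift-ansible atomic-openshift-utils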
TASK [Fail on unsupported resource migration 'PetSets'] ************************
fatal: [openshift-224.lab.eng.nay.redhat.com]: FAILED! => {
    "changed": false,
    "failed": true
}

MSG:

PetSet objects were detected in your cluster. These are an Alpha feature in upstream Kubernetes 1.4 and are not supported by Red Hat. In Kubernetes 1.5, they are replaced by the Beta feature StatefulSets. Red Hat currently does not offer support for either PetSets or StatefulSets. Automatically migrating PetSets to StatefulSets in OpenShift Container Platform (OCP) 3.5 is not supported. See the Kubernetes "Upgrading from PetSets to StatefulSets" documentation for additional information:

https://kubernetes.io/docs/tasks/manage-stateful-set/upgrade-pet-set-to-stateful-set/

PetSets MUST be removed before upgrading to OCP 3.5. Red Hat strongly recommends reading the above referenced documentation in its entirety before taking any destructive actions. If you want to simply remove all PetSets without manually migrating to StatefulSets, run this command as a user with cluster-admin privileges:

$ oc get petsets --all-namespaces -o yaml | oc delete -f - --cascade=false

to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_5/upgrade.retry

PLAY RECAP *********************************************************************
localhost                              : ok=10   changed=0   unreachable=0   failed=0
openshift-223.lab.eng.nay.redhat.com   : ok=103  changed=9   unreachable=0   failed=0
openshift-224.lab.eng.nay.redhat.com   : ok=131  changed=10  unreachable=0   failed=1
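For anyone hitting this gate, the rough sequence is: read the linked migration docs, remove the PetSets with the command from the message above, confirm none remain, and re-run the upgrade. A sketch, with the inventory path being environment-specific:

  # Remove all PetSets without deleting their pods (--cascade=false), as suggested by the validator message.
  oc get petsets --all-namespaces -o yaml | oc delete -f - --cascade=false

  # Confirm no PetSets remain in any namespace.
  oc get petsets --all-namespaces

  # Re-run the 3.5 upgrade playbook.
  ansible-playbook -i /path/to/inventory \
      /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_5/upgrade.yml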
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0903