Bug 1428229 - fail to upgrade ocp3.4 to 3.5 while petset created in cluster
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Upgrade
Version: 3.5.0
Platform: Unspecified
Severity: high
Priority: high
Assigned To: Tim Bielawa
QA Contact: Anping Li
Depends On:
Blocks:
 
Reported: 2017-03-02 00:26 EST by liujia
Modified: 2017-07-24 10 EDT
CC: 7 users

See Also:
Fixed In Version: openshift-ansible-3.5.30-1
Doc Type: Bug Fix
Doc Text:
Cause: Kubernetes resources not supported by Red Hat (PetSets) were present in an OCP cluster. During the upgrade from 3.4 to 3.5, PetSets were deprecated and replaced with a new (also unsupported) resource, StatefulSets. Automatic migration from PetSets to StatefulSets is not possible. Consequence: The upgrade fails because the unsupported resources cannot be automatically migrated. Fix: An additional validation step was added to the pre-upgrade validation playbook, which searches the cluster for PetSets. Result: If any existing PetSets are detected, the installation errors out and quits. The user is shown an informational message (including documentation references) describing what went wrong, why, and what the user's choices are for continuing the upgrade without migrating PetSets.
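The check described above can be sketched as a pair of Ansible tasks. This is an illustrative reconstruction, not the exact contents of validator.yml in openshift-ansible; the task wording and the register name `petset_check` are made up:

```yaml
# Illustrative sketch of the pre-upgrade PetSet validation
# (hypothetical names; see validator.yml in openshift-ansible
# for the real tasks).
- name: Check if legacy PetSets exist
  oc_obj:
    state: list
    kind: petsets
    all_namespaces: true
    kubeconfig: /etc/origin/master/admin.kubeconfig
  register: petset_check

- name: Fail on unsupported resource migration 'PetSets'
  fail:
    msg: >
      PetSet objects were detected in your cluster. They cannot be
      migrated automatically and must be removed before upgrading
      to OCP 3.5.
  when: petset_check.results.results[0]['items'] | length > 0
```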
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-04-12 15:03:02 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
The upgrade logs when using a PetSet (170.88 KB, application/x-gzip)
2017-03-08 23:59 EST, Anping Li


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0903 normal SHIPPED_LIVE OpenShift Container Platform atomic-openshift-utils bug fix and enhancement 2017-04-12 18:45:42 EDT

Description liujia 2017-03-02 00:26:01 EST
Description of problem:
Upgrading OCP 3.4 to 3.5 fails when a PetSet exists in the cluster, because PetSets are not supported in 3.5 (Kubernetes 1.5).

    fatal: [x.x.x.x -> x.x.x.x]: FAILED! => {
        "changed": true,
        "cmd": [
            "/usr/local/bin/oadm",
            "drain",
            "ip-172-18-14-148.ec2.internal",
            "--force",
            "--delete-local-data",
            "--ignore-daemonsets"
        ],
        "delta": "0:00:04.205678",
        "end": "2017-03-01 05:53:54.861837",
        "failed": true,
        "invocation": {
            "module_args": {
                "_raw_params": "/usr/local/bin/oadm drain ip-172-18-14-148.ec2.internal --force --delete-local-data --ignore-daemonsets",
                "_uses_shell": false,
                "chdir": null,
                "creates": null,
                "executable": null,
                "removes": null,
                "warn": true
            },
            "module_name": "command"
        },
        "rc": 1,
        "start": "2017-03-01 05:53:50.656159",
        "warnings": []
    }
     
    STDOUT:
     
    node "ip-172-18-14-148.ec2.internal" already cordoned
     
     
    STDERR:
     
    error: Unknown controller kind "PetSet": hello-petset-1, hello-petset-1



Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.5.17-1.git.0.561702e.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Install OCP 3.4.
2. Create a PetSet.
3. Upgrade OCP 3.4 to 3.5.

Actual results:
The upgrade fails at task [Drain Node for Kubelet upgrade].

Expected results:
The upgrade completes successfully even when a PetSet exists in the cluster.

Additional info:
https://kubernetes.io/docs/tasks/manage-stateful-set/upgrade-pet-set-to-stateful-set/
Comment 1 Scott Dodson 2017-03-06 14:30:55 EST
Clayton, what should we be doing about pet sets during upgrades where we drain nodes?
Comment 2 Tim Bielawa 2017-03-06 16:47:10 EST
Asking around on aos-devel to get broader input on this.
Comment 4 Tim Bielawa 2017-03-07 11:03:05 EST
Consensus from the mailing list:

Migrating petsets to stateful sets is out of scope and will not be supported in the OpenShift-Ansible 3.4->3.5 upgrade playbooks.

These features were never officially supported, and no support statements about them were ever provided. Furthermore, PetSets were only ever ALPHA status in the Kubernetes versions that shipped them. As StatefulSets are still Beta resources in Kubernetes 1.5, they may likewise not be supported in version migrations when 3.6 is released (pending future official support communications, of course).

I am working on a patch now that runs during upgrade pre-validation and detects existing PetSets. Users will be given a helpful message clarifying non-support of the PetSet feature, along with a reference to the official Kubernetes migration docs.
Comment 5 Tim Bielawa 2017-03-07 17:00:40 EST
Merged into master. Please re-test.
Comment 7 Anping Li 2017-03-08 23:59 EST
Created attachment 1261429 [details]
The upgrade logs when using a PetSet

The upgrade still failed on the PetSet (the new check did not prevent the drain error):

1. oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/b90a5a05c6af96b8e94085822e723ef7be57fe5b/petset/hello-petset.yaml

2. run upgrade.yml
fatal: [openshift-225.lab.eng.nay.redhat.com -> openshift-223.lab.eng.nay.redhat.com]: FAILED! => {
    "changed": true,
    "cmd": [
        "/usr/local/bin/oadm",
        "drain",
        "openshift-225.lab.eng.nay.redhat.com",
        "--force",
        "--delete-local-data",
        "--ignore-daemonsets"
    ],
    "delta": "0:00:00.350344",
    "end": "2017-03-08 23:32:31.383682",
    "failed": true,
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/oadm drain openshift-225.lab.eng.nay.redhat.com --force --delete-local-data --ignore-daemonsets",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "warn": true
        },
        "module_name": "command"
    },
    "rc": 1,
    "start": "2017-03-08 23:32:31.033338",
    "warnings": []
}

STDOUT:

node "openshift-225.lab.eng.nay.redhat.com" already cordoned


STDERR:

error: Unknown controller kind "PetSet": hello-petset-0, hello-petset-0, hello-petset-1, hello-petset-1
Comment 8 Tim Bielawa 2017-03-10 14:01:23 EST
I believe I've fixed the issue you ran into. I had an incorrect comparison test in the original PR; this works better:

https://github.com/openshift/openshift-ansible/pull/3623

The other issue you see mentioned in there is unrelated to my changes.
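For context on the "incorrect comparison test": the failure task was being skipped even though PetSets were listed, because its conditional never evaluated true. A plausible shape of the corrected condition is sketched below; the exact expression is in PR 3623, and `petset_check` is a hypothetical register name:

```yaml
# The oc_obj list result nests the PetSet objects under
# results.results[0]['items']; the failure task should trigger
# whenever that list is non-empty.
- name: Fail on unsupported resource migration 'PetSets'
  fail:
    msg: "PetSet objects were detected in your cluster; see the Kubernetes migration docs."
  when: petset_check.results.results[0]['items'] | length > 0
```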
Comment 9 Anping Li 2017-03-12 22:01:05 EDT
Yes, I can see the fixed task was executed, but I am not sure why "skipped" is true for TASK [FAIL ON Resource migration 'PetSets' unsupported]:

TASK [Check if legacy PetSets exist] *******************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/v3_5/validator.yml:31
Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py
<openshift-223.lab.eng.nay.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<openshift-223.lab.eng.nay.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r openshift-223.lab.eng.nay.redhat.com '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo ~/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639 `" && echo ansible-tmp-1489033498.92-73057137625639="` echo ~/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639 `" ) && sleep 0'"'"''
<openshift-223.lab.eng.nay.redhat.com> PUT /tmp/tmphkwjKv TO /root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/oc_obj.py
<openshift-223.lab.eng.nay.redhat.com> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r '[openshift-223.lab.eng.nay.redhat.com]'
<openshift-223.lab.eng.nay.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<openshift-223.lab.eng.nay.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r openshift-223.lab.eng.nay.redhat.com '/bin/sh -c '"'"'chmod u+x /root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/ /root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/oc_obj.py && sleep 0'"'"''
<openshift-223.lab.eng.nay.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<openshift-223.lab.eng.nay.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r -tt openshift-223.lab.eng.nay.redhat.com '/bin/sh -c '"'"'/usr/bin/python /root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/oc_obj.py; rm -rf "/root/.ansible/tmp/ansible-tmp-1489033498.92-73057137625639/" > /dev/null 2>&1 && sleep 0'"'"''
ok: [openshift-223.lab.eng.nay.redhat.com] => {
    "changed": false, 
    "invocation": {
        "module_args": {
            "all_namespaces": true, 
            "content": null, 
            "debug": false, 
            "delete_after": false, 
            "files": null, 
            "force": false, 
            "kind": "petsets", 
            "kubeconfig": "/etc/origin/master/admin.kubeconfig", 
            "name": null, 
            "namespace": "default", 
            "selector": null, 
            "state": "list"
        }, 
        "module_name": "oc_obj"
    }, 
    "results": {
        "cmd": "/usr/local/bin/oc get petsets -o json --all-namespaces", 
        "results": [
            {
                "apiVersion": "v1", 
                "items": [
                    {
                        "apiVersion": "apps/v1alpha1", 
                        "kind": "PetSet", 
                        "metadata": {
                            "creationTimestamp": "2017-03-09T03:29:45Z", 
                            "generation": 1, 
                            "labels": {
                                "app": "hello-pod"
                            }, 
                            "name": "hello-petset", 
                            "namespace": "default", 
                            "resourceVersion": "5629", 
                            "selfLink": "/apis/apps/v1alpha1/namespaces/default/petsets/hello-petset", 
                            "uid": "a42a35a6-0478-11e7-932f-fa163e30eba3"
                        }, 
                        "spec": {
                            "replicas": 2, 
                            "selector": {
                                "matchLabels": {
                                    "app": "hello-pod"
                                }
                            }, 
                            "serviceName": "foo", 
                            "template": {
                                "metadata": {
                                    "annotations": {
                                        "pod.alpha.kubernetes.io/initialized": "true"
                                    }, 
                                    "creationTimestamp": null, 
                                    "labels": {
                                        "app": "hello-pod"
                                    }
                                }, 
                                "spec": {
                                    "containers": [
                                        {
                                            "image": "openshift/hello-openshift:latest", 
                                            "imagePullPolicy": "IfNotPresent", 
                                            "name": "hello-pod", 
                                            "ports": [
                                                {
                                                    "containerPort": 8080, 
                                                    "protocol": "TCP"
                                                }
                                            ], 
                                            "resources": {}, 
                                            "securityContext": {
                                                "capabilities": {}, 
                                                "privileged": false
                                            }, 
                                            "terminationMessagePath": "/dev/termination-log", 
                                            "volumeMounts": [
                                                {
                                                    "mountPath": "/tmp", 
                                                    "name": "tmp"
                                                }
                                            ]
                                        }
                                    ], 
                                    "dnsPolicy": "ClusterFirst", 
                                    "restartPolicy": "Always", 
                                    "securityContext": {}, 
                                    "terminationGracePeriodSeconds": 0, 
                                    "volumes": [
                                        {
                                            "emptyDir": {}, 
                                            "name": "tmp"
                                        }
                                    ]
                                }
                            }
                        }, 
                        "status": {
                            "replicas": 2
                        }
                    }
                ], 
                "kind": "List", 
                "metadata": {}
            }
        ], 
        "returncode": 0
    }, 
    "state": "list"
}

TASK [FAIL ON Resource migration 'PetSets' unsupported] ************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/v3_5/validator.yml:38
skipping: [openshift-223.lab.eng.nay.redhat.com] => {
    "changed": false, 
    "skip_reason": "Conditional check failed", 
    "skipped": true
}
Comment 10 Scott Dodson 2017-03-13 14:09:46 EDT
https://github.com/openshift/openshift-ansible/pull/3638 Backported to release-1.5

The task name referenced in comment 9 no longer exists, so I think we should test with a new build.
Comment 11 Scott Dodson 2017-03-13 14:11:58 EDT
fixed in openshift-ansible-3.5.30-1
Comment 12 Anping Li 2017-03-13 23:27:27 EDT
TASK [Fail on unsupported resource migration 'PetSets'] ************************
fatal: [openshift-224.lab.eng.nay.redhat.com]: FAILED! => {
    "changed": false, 
    "failed": true
}

MSG:

PetSet objects were detected in your cluster. These are an Alpha feature in upstream Kubernetes 1.4 and are not supported by Red Hat. In Kubernetes 1.5, they are replaced by the Beta feature StatefulSets. Red Hat currently does not offer support for either PetSets or StatefulSets.
Automatically migrating PetSets to StatefulSets in OpenShift Container Platform (OCP) 3.5 is not supported. See the Kubernetes "Upgrading from PetSets to StatefulSets" documentation for additional information:
https://kubernetes.io/docs/tasks/manage-stateful-set/upgrade-pet-set-to-stateful-set/
PetSets MUST be removed before upgrading to OCP 3.5. Red Hat strongly recommends reading the above referenced documentation in its entirety before taking any destructive actions.
If you want to simply remove all PetSets without manually migrating to StatefulSets, run this command as a user with cluster-admin privileges:
$ oc get petsets --all-namespaces -o yaml | oc delete -f - --cascade=false

	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_5/upgrade.retry

PLAY RECAP *********************************************************************
localhost                  : ok=10   changed=0    unreachable=0    failed=0   
openshift-223.lab.eng.nay.redhat.com : ok=103  changed=9    unreachable=0    failed=0   
openshift-224.lab.eng.nay.redhat.com : ok=131  changed=10   unreachable=0    failed=1
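For reference, manually recreating the hello-petset object from comment 9 as a StatefulSet (its Kubernetes 1.5 replacement, under the apps/v1beta1 API) would look roughly like the manifest below. This is an untested sketch for illustration; as the failure message notes, Red Hat supported neither PetSets nor StatefulSets at this point:

```yaml
# Hypothetical StatefulSet equivalent of the hello-petset example
# (StatefulSets are a Beta resource in Kubernetes 1.5).
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: hello-petset
  labels:
    app: hello-pod
spec:
  serviceName: foo
  replicas: 2
  template:
    metadata:
      labels:
        app: hello-pod
    spec:
      containers:
      - name: hello-pod
        image: openshift/hello-openshift:latest
        ports:
        - containerPort: 8080
```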
Comment 14 errata-xmlrpc 2017-04-12 15:03:02 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0903
