Bug 1455485

Summary: Fail to upgrade ocp due to v3 data checking for embedded etcd env
Product: OpenShift Container Platform Reporter: liujia <jiajliu>
Component: Cluster Version Operator Assignee: Jan Chaloupka <jchaloup>
Status: CLOSED ERRATA QA Contact: liujia <jiajliu>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.6.0 CC: aos-bugs, jokerman, mmccomas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openshift-ansible-3.6.100-1.git.0.08e52a1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-10 05:25:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description liujia 2017-05-25 10:07:25 UTC
Description of problem:
The upgrade failed at task [etcd_upgrade : Generate etcd backup] because the etcd data directory passed to etcdctl does not exist. With embedded etcd, the etcd data-dir should be /var/lib/origin/openshift.local.etcd/.
 
fatal: [x.x.x.x]: FAILED! => {
    "changed": true,
    "cmd": [
        "etcdctl",
        "backup",
        "--data-dir=/var/lib/etcd/",
        "--backup-dir=/var/lib/etcd//openshift-backup-etcd_backup_tag20170525051038"
    ],
    "delta": "0:00:00.015679",
    "end": "2017-05-25 05:17:06.055373",
    "failed": true,
    "invocation": {
        "module_args": {
            "_raw_params": "etcdctl backup --data-dir=/var/lib/etcd/ --backup-dir=/var/lib/etcd//openshift-backup-etcd_backup_tag20170525051038",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "warn": true
        },
        "module_name": "command"
    },
    "rc": 1,
    "start": "2017-05-25 05:17:06.039694",
    "warnings": []
}

STDERR:

2017-05-25 05:17:06.054235 I | open /var/lib/etcd/member/snap: no such file or directory
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_6/upgrade.retry
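
For illustration only, a minimal Python sketch of the path selection described above: choose the data directory according to whether etcd is embedded, then build the same etcdctl backup command the task runs. The two paths come from this report; the function names, the `embedded` flag and the `backup_tag` parameter are hypothetical, and this is not the code from the upstream fix.

```python
import posixpath
from typing import List

# Paths taken from this report; all names below are hypothetical.
EXTERNAL_ETCD_DATA_DIR = "/var/lib/etcd/"
EMBEDDED_ETCD_DATA_DIR = "/var/lib/origin/openshift.local.etcd/"


def etcd_data_dir(embedded: bool) -> str:
    """Return the data directory to back up for the given etcd deployment."""
    return EMBEDDED_ETCD_DATA_DIR if embedded else EXTERNAL_ETCD_DATA_DIR


def backup_command(embedded: bool, backup_tag: str) -> List[str]:
    """Build the etcdctl backup command for the selected data directory."""
    data_dir = etcd_data_dir(embedded)
    # posixpath.join avoids the doubled slash seen in the failing command.
    backup_dir = posixpath.join(data_dir, f"openshift-backup-{backup_tag}")
    return [
        "etcdctl",
        "backup",
        f"--data-dir={data_dir}",
        f"--backup-dir={backup_dir}",
    ]
```

With `embedded=True` the command points at /var/lib/origin/openshift.local.etcd/ instead of /var/lib/etcd/, which is the behaviour this report expects.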


Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.6.80-1.git.0.807fc98.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Install OCP 3.5 (all-in-one with embedded etcd)
2. Upgrade 3.5 to 3.6

Actual results:
Upgrade failed.

Expected results:
Upgrade succeeds.

Additional info:
The logs contain two more errors about the directory '/var/lib/etcd/':
1) for TASK [etcd_upgrade : Check available disk space for etcd backup] 
STDERR:

df: ‘/var/lib/etcd/’: No such file or directory

2) for TASK [etcd_upgrade : Check current etcd disk usage]
STDERR:

du: cannot access ‘/var/lib/etcd/’: No such file or directory


# ls -la /var/lib/ | grep etcd
drwxr-xr-x.  3 etcd    etcd      60 May 25 05:17 etcd

# df --output=avail -k /var/lib/etcd/ | tail -n 1
8072904
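
As a rough sketch of what the two failing checks do, the following Python mirrors the `du` and `df` steps and fails early with a clear message when the data directory is missing. The function names are hypothetical, and the "available must exceed current usage" rule is an assumption for illustration, not the threshold the etcd_upgrade role actually applies.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of regular files under path (a rough stand-in for du)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def check_backup_space(data_dir):
    """Return (used, available, ok) for data_dir, or raise if it is missing."""
    if not os.path.isdir(data_dir):
        # Surfaces the root cause before du/df-style checks can fail.
        raise FileNotFoundError(f"etcd data directory not found: {data_dir}")
    used = dir_size_bytes(data_dir)
    available = shutil.disk_usage(data_dir).free
    # Assumed rule: the backup needs at least as much space as is in use.
    return used, available, available > used
```

Calling `check_backup_space` with the embedded path from the description would report usage for the directory that actually holds the data, rather than failing on /var/lib/etcd/.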

Comment 1 Jan Chaloupka 2017-06-09 13:33:35 UTC
Upstream PR: https://github.com/openshift/openshift-ansible/pull/4401

Comment 2 Jan Chaloupka 2017-06-12 08:09:27 UTC
Merged upstream -> switching to MODIFIED

Comment 4 liujia 2017-06-23 05:21:56 UTC
version:
atomic-openshift-utils-3.6.121-1.git.0.ed0b72c.el7.noarch

1. install ocp3.5 (all in one with embedded etcd)
2. upgrade 3.5 to 3.6

Upgrade succeeded.

Comment 6 errata-xmlrpc 2017-08-10 05:25:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716