1668335 – Failed to deploy OCS3.10 on OCP3.11.z, lvm commands fail

Summary: Failed to deploy OCS3.10 on OCP3.11.z, lvm commands fail

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	cns-ansible
Sub Component:
Version:	cns-3.10
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	---
Target Release:	OCS 3.11.z Batch Update 2
Assignee:	Jose A. Rivera
QA Contact:	Prasanth
Docs Contact:
URL:
Whiteboard:
Depends On:	1669080
Blocks:	1669979

Reported:	2019-01-22 13:23 UTC by Apeksha
Modified:	2019-03-27 06:44 UTC
CC List:	12 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Previously, the templates shipped with some versions of the openshift-ansible installer were not backwards compatible with all versions of the rhgs-server container. This prevented processes within the container from accessing devices in /dev, leading to failures in LVM and block device operations. With this fix, the templates in openshift-ansible adapt to the version of the OCS container being deployed, so deploying different versions of OCS with up-to-date versions of openshift-ansible succeeds as expected.
Clone Of:
Environment:
Last Closed:	2019-03-27 06:44:36 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:0670	0	None	None	None	2019-03-27 06:44:39 UTC

Description Apeksha 2019-01-22 13:23:24 UTC

Description of problem:
Failed to deploy OCS3.10 on OCP3.11.z, lvm commands fail

Version-Release number of selected component (if applicable):
atomic-openshift-3.11.69-1.git.0.7478b86.el7.x86_64
heketi-client-7.0.0-15.el7rhgs.x86_64
rhgs3/rhgs-server-rhel7:v3.10
openshift-ansible-3.11.70-1

How reproducible: Twice


Steps to Reproduce:
CI for OCP3.11.z+OCS3.10 failed twice

Fails at the step:
TASK [openshift_storage_glusterfs : Create heketi DB volume]

fatal: [apu1-v311z-ocs-v310-master-0]: FAILED! => {
    "changed": true, 
    "cmd": [
        "oc", 
        "--config=/tmp/openshift-glusterfs-ansible-fHBHOU/admin.kubeconfig", 
        "rsh", 
        "--namespace=storage", 
        "deploy-heketi-storage-1-ccdqs", 
        "heketi-cli", 
        "-s", 
        "http://localhost:8080", 
        "--user", 
        "admin", 
        "--secret", 
        "admin", 
        "setup-openshift-heketi-storage", 
        "--image", 
        "rhgs3/rhgs-volmanager-rhel7:v3.10", 
        "--listfile", 
        "/tmp/heketi-storage.json"
    ], 
    "delta": "0:01:02.884930", 
    "end": "2019-01-22 12:23:45.129174", 
    "invocation": {
        "module_args": {
            "_raw_params": "oc --config=/tmp/openshift-glusterfs-ansible-fHBHOU/admin.kubeconfig rsh --namespace=storage deploy-heketi-storage-1-ccdqs heketi-cli -s http://localhost:8080 --user admin  --secret 'admin' setup-openshift-heketi-storage --image rhgs3/rhgs-volmanager-rhel7:v3.10 --listfile /tmp/heketi-storage.json", 
            "_uses_shell": false, 
            "argv": null, 
            "chdir": null, 
            "creates": null, 
            "executable": null, 
            "removes": null, 
            "stdin": null, 
            "warn": true
        }
    }, 
    "msg": "non-zero return code", 
    "rc": 255, 
    "start": "2019-01-22 12:22:42.244244", 
    "stderr": "Error: WARNING: This metadata update is NOT backed up.\n  /dev/vg_cd2fa2da8c597b187b27ef1e86fc1036/lvol0: not found: device not cleared\n  Aborting. Failed to wipe start of new LV.\ncommand terminated with exit code 255", 
    "stderr_lines": [
        "Error: WARNING: This metadata update is NOT backed up.", 
        "  /dev/vg_cd2fa2da8c597b187b27ef1e86fc1036/lvol0: not found: device not cleared", 
        "  Aborting. Failed to wipe start of new LV.", 
        "command terminated with exit code 255"
    ], 
    "stdout": "", 
    "stdout_lines": []
}
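
For triage, the LVM failure in the output above can be picked out of the task result programmatically. The following is only a minimal sketch in Python, not part of openshift-ansible or the CI job: the function name find_missing_lv_nodes and the regular expression are illustrative assumptions, and the sample data is a trimmed copy of the stderr_lines shown in this report.

```python
import json
import re

# Assumed pattern for the failure seen in this report: LVM cannot find the
# device node of the newly created LV, so it cannot wipe the start of it.
LV_NOT_FOUND = re.compile(r"^\s*(/dev/\S+): not found: device not cleared$")


def find_missing_lv_nodes(result):
    """Return the /dev paths that LVM reported as 'not found' in a task result.

    `result` is the dict Ansible prints after "FAILED! =>"; only its
    "stderr_lines" list is inspected.
    """
    paths = []
    for line in result.get("stderr_lines", []):
        match = LV_NOT_FOUND.match(line)
        if match:
            paths.append(match.group(1))
    return paths


# Trimmed copy of the failure output from this report.
sample = json.loads("""
{
  "rc": 255,
  "stderr_lines": [
    "Error: WARNING: This metadata update is NOT backed up.",
    "  /dev/vg_cd2fa2da8c597b187b27ef1e86fc1036/lvol0: not found: device not cleared",
    "  Aborting. Failed to wipe start of new LV.",
    "command terminated with exit code 255"
  ]
}
""")

missing = find_missing_lv_nodes(sample)
print(missing)
```

A non-empty list suggests that the process inside the container could not see the new LV under /dev, which matches the symptom reported here; an empty list means the task failed for some other reason.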

Expected results:
Install succeeds.

Comment 2 Apeksha 2019-01-22 13:59:14 UTC

The OCS ansible log file, the output of the central CI Jenkins job, and sosreports from all nodes are available here: http://rhsqe-repo.lab.eng.blr.redhat.com/cns/bugs/BZ-1668335/

Comment 3 Niels de Vos 2019-01-24 10:47:58 UTC

Is there an openshift-ansible bug for this?

https://github.com/openshift/openshift-ansible/pull/11068 should address the problem.

Comment 14 Anjana KD 2019-03-19 06:52:25 UTC

Hello Niels and John,

Could you please provide the bug fix doc text (in CCFR format) and change the doc type too?

Comment 21 errata-xmlrpc 2019-03-27 06:44:36 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0670
