Bug 1334034 - OSD creation failure with physical disks
Summary: OSD creation failure with physical disks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Storage Console
Classification: Red Hat
Component: ceph-ansible
Version: 2
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
: 2
Assignee: Alfredo Deza
QA Contact: Daniel Horák
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-05-07 11:31 UTC by Nishanth Thomas
Modified: 2016-08-23 19:50 UTC
CC List: 9 users

Fixed In Version: ceph-ansible-1.0.5-8.el7scon
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-23 19:50:09 UTC
Target Upstream Version:




Links:
  Github ceph/ceph-ansible issue 759 (Priority: None, Status: None, Last Updated: Never)
  Red Hat Product Errata RHEA-2016:1754 (Priority: normal, Status: SHIPPED_LIVE, Summary: New packages: Red Hat Storage Console 2.0, Last Updated: 2017-04-18 19:09:06 UTC)

Description Nishanth Thomas 2016-05-07 11:31:48 UTC
OSD creation failed for several disks with the following error:

TASK: [ceph-osd | fix partitions gpt header or labels of the osd disks] ******* 
failed: [dhcp-126-123.lab.eng.brq.redhat.com] => (item=[{'changed': False, 'end': '2016-05-05 09:45:15.887391', 'failed': False, 'stdout': u'', 'cmd': 'parted --script /dev/vde print > /dev/null 2>&1', 'rc': 1, 'start': '2016-05-05 09:45:15.876054', 'item': u'/dev/vde', 'warnings': [], 'delta': '0:00:00.011337', 'invocation': {'module_name': u'shell', 'module_complex_args': {}, 'module_args': u'parted --script /dev/vde print > /dev/null 2>&1'}, 'stdout_lines': [], 'failed_when_result': False, 'stderr': u''}, u'/dev/vde']) => {"changed": false, "cmd": "sgdisk --zap-all --clear --mbrtogpt -g -- /dev/vde", "delta": "0:00:01.245651", "end": "2016-05-05 09:45:19.166575", "item": [{"changed": false, "cmd": "parted --script /dev/vde print > /dev/null 2>&1", "delta": "0:00:00.011337", "end": "2016-05-05 09:45:15.887391", "failed": false, "failed_when_result": false, "invocation": {"module_args": "parted --script /dev/vde print > /dev/null 2>&1", "module_complex_args": {}, "module_name": "shell"}, "item": "/dev/vde", "rc": 1, "start": "2016-05-05 09:45:15.876054", "stderr": "", "stdout": "", "stdout_lines": [], "warnings": []}, "/dev/vde"], "rc": 3, "start": "2016-05-05 09:45:17.920924", "warnings": []}
stderr: Caution: invalid main GPT header, but valid backup; regenerating main header
from backup!

Warning! Main partition table CRC mismatch! Loaded backup partition table
instead of main partition table!

Warning! One or more CRCs don't match. You should repair the disk!

Invalid partition data!
stdout: Caution! After loading partitions, the CRC doesn't check out!
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Information: Creating fresh partition table; will override earlier problems!
Non-GPT disk; not saving changes. Use -g to override.

FATAL: all hosts have already failed -- aborting

================================
A number of OSDs failed with this error, and all of them were physical disks.
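
For reference, the two commands behind the failing task, taken from the Ansible output above (device /dev/vde as reported here), boil down to the following shell sequence:

  # Reconstructed from the task output above; /dev/vde is the device from this report.
  # 1) Probe whether the disk already has a readable partition table
  #    (rc=1 here, i.e. no usable label, so the play moves on to zap the disk):
  parted --script /dev/vde print > /dev/null 2>&1

  # 2) Zap the disk and convert it to GPT; this is the command that exits with rc=3
  #    ("Non-GPT disk; not saving changes.") and aborts the play:
  sgdisk --zap-all --clear --mbrtogpt -g -- /dev/vde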

Comment 2 Alfredo Deza 2016-05-09 10:46:54 UTC
What happens if you re-run the OSD configuration for that server? (I added the upstream ticket for ceph-ansible.)
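
If you have direct access to the node, a possible manual cleanup of the disk before retrying (a rough workaround sketch only, not the ceph-ansible fix) would be something like:

  # Clear filesystem/partition signatures, then wipe the GPT structures.
  wipefs --all /dev/vde
  sgdisk --zap-all -- /dev/vde
  # If the first zap only regenerates the main header from the backup copy (as in
  # the output above) and still exits non-zero, a second run may be needed to
  # leave the disk clean:
  sgdisk --zap-all -- /dev/vde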

Comment 3 Alfredo Deza 2016-05-09 15:57:57 UTC
Pull request merged upstream https://github.com/ceph/ceph-ansible/pull/766

Comment 4 Nishanth Thomas 2016-05-10 05:05:56 UTC
Daniel,

Can you provide this info?

Comment 5 Daniel Horák 2016-05-10 07:22:15 UTC
Nishanth,

I'm not sure how I can check it. Is it possible to relaunch the OSD configuration from the USM web UI or API?
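
If it has to go through the ceph-installer REST API directly, I would expect the request to look roughly like the sketch below, but the endpoint path, port and payload fields here are guesses on my part, not verified:

  # Rough guess only - endpoint path and JSON fields are assumptions, not verified;
  # the hostname usm-server.example.com stands in for the actual USM server.
  curl -X POST http://usm-server.example.com:8181/api/osd/configure/ \
       -H 'Content-Type: application/json' \
       -d '{"host": "dhcp-126-123.lab.eng.brq.redhat.com", "devices": ["/dev/vde"]}'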

Comment 6 monti lawrence 2016-05-10 20:08:20 UTC
Alfredo will get changes downstream.

Comment 10 Daniel Horák 2016-08-02 15:29:38 UTC
I've retested the initial scenario with disks that were not properly cleaned, and all the expected OSDs were created.

Tested on:
USM Server/ceph-installer server (RHEL 7.2):
  ceph-ansible-1.0.5-31.el7scon.noarch
  ceph-installer-1.0.14-1.el7scon.noarch
  rhscon-ceph-0.0.39-1.el7scon.x86_64
  rhscon-core-0.0.39-1.el7scon.x86_64
  rhscon-core-selinux-0.0.39-1.el7scon.noarch
  rhscon-ui-0.0.51-1.el7scon.noarch
  salt-2015.5.5-1.el7.noarch
  salt-master-2015.5.5-1.el7.noarch
  salt-selinux-0.0.39-1.el7scon.noarch

Ceph node (RHEL 7.2):
  ceph-base-10.2.2-32.el7cp.x86_64
  ceph-common-10.2.2-32.el7cp.x86_64
  ceph-osd-10.2.2-32.el7cp.x86_64
  ceph-selinux-10.2.2-32.el7cp.x86_64
  libcephfs1-10.2.2-32.el7cp.x86_64
  python-cephfs-10.2.2-32.el7cp.x86_64
  rhscon-agent-0.0.16-1.el7scon.noarch
  rhscon-core-selinux-0.0.39-1.el7scon.noarch
  salt-2015.5.5-1.el7.noarch
  salt-minion-2015.5.5-1.el7.noarch
  salt-selinux-0.0.39-1.el7scon.noarch

>> VERIFIED

Comment 12 errata-xmlrpc 2016-08-23 19:50:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1754

