Bug 1292577

Summary: ceph-deploy mon create-initial fails
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Jim Curtis <jim.curtis>
Component: Ceph-Installer
Assignee: Alfredo Deza <adeza>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 1.3.2
CC: adeza, aschoen, ceph-eng-bugs, dbajot, gmeno, HassanHashemi, jim.curtis, jss, kdreyer, nthomas, sankarshan
Target Milestone: rc
Target Release: 1.3.4
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-04-24 17:31:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jim Curtis 2015-12-17 20:18:07 UTC
Description of problem:

I followed the quick ceph-deploy steps to set up 4 nodes, as described at:

http://docs.ceph.com/docs/master/start/quick-ceph-deploy/

I had completed the Preflight and was working through the Storage Cluster Quick Start. At step #5, I executed ceph-deploy mon create-initial and it failed with the following:

[2015-12-17 05:27:20,306][jicurtis-ceph-node1-test][DEBUG ] create a done file to avoid re-doing the mon deployment
[2015-12-17 05:27:20,310][jicurtis-ceph-node1-test][DEBUG ] create the init path if it does not exist
[2015-12-17 05:27:20,322][jicurtis-ceph-node1-test][INFO  ] Running command: sudo systemctl enable ceph.target
[2015-12-17 05:27:20,518][jicurtis-ceph-node1-test][INFO  ] Running command: sudo systemctl enable ceph-mon@jicurtis-ceph-node1-test
[2015-12-17 05:27:20,566][jicurtis-ceph-node1-test][WARNING] Failed to issue method call: No such file or directory
[2015-12-17 05:27:20,568][jicurtis-ceph-node1-test][ERROR ] RuntimeError: command returned non-zero exit status: 1
[2015-12-17 05:27:20,569][ceph_deploy.mon][ERROR ] Failed to execute command: systemctl enable ceph-mon@jicurtis-ceph-node1-test
[2015-12-17 05:27:20,569][ceph_deploy][ERROR ] GenericError: Failed to create 1 monitors



Version-Release number of selected component (if applicable):
ceph version is 9.2.0
ceph-deploy version is 1.5.30

How reproducible:

It seems easy to reproduce.

Steps to Reproduce:
1. Complete all steps in Preflight at http://docs.ceph.com/docs/master/start/quick-start-preflight/
2. Proceed to Storage Cluster Quick Start at http://docs.ceph.com/docs/master/start/quick-ceph-deploy/
3. At step #5, ceph-deploy mon create-initial will fail.

Actual results:

[2015-12-17 05:27:20,306][jicurtis-ceph-node1-test][DEBUG ] create a done file to avoid re-doing the mon deployment
[2015-12-17 05:27:20,310][jicurtis-ceph-node1-test][DEBUG ] create the init path if it does not exist
[2015-12-17 05:27:20,322][jicurtis-ceph-node1-test][INFO  ] Running command: sudo systemctl enable ceph.target
[2015-12-17 05:27:20,518][jicurtis-ceph-node1-test][INFO  ] Running command: sudo systemctl enable ceph-mon@jicurtis-ceph-node1-test
[2015-12-17 05:27:20,566][jicurtis-ceph-node1-test][WARNING] Failed to issue method call: No such file or directory
[2015-12-17 05:27:20,568][jicurtis-ceph-node1-test][ERROR ] RuntimeError: command returned non-zero exit status: 1
[2015-12-17 05:27:20,569][ceph_deploy.mon][ERROR ] Failed to execute command: systemctl enable ceph-mon@jicurtis-ceph-node1-test
[2015-12-17 05:27:20,569][ceph_deploy][ERROR ] GenericError: Failed to create 1 monitors


Expected results:

ceph-deploy mon create-initial completes successfully and the monitor is created.

Additional info:

This problem looks like it was encountered by another user and reported on the ceph-users list:

http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2015-December/006493.html
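
The "No such file or directory" reply to `systemctl enable ceph-mon@...` suggests, though this report does not confirm it, that systemd could not find a `ceph-mon@.service` template on the node. A minimal Python sketch for checking that assumption is below. The directory list is a common default rather than an exhaustive one, and `find_unit_template` is a hypothetical helper name, not part of ceph-deploy.

```python
import os

# Directories systemd commonly searches for system units.
# Assumed list: a given distribution may use additional paths.
UNIT_DIRS = (
    "/etc/systemd/system",
    "/run/systemd/system",
    "/usr/lib/systemd/system",
    "/lib/systemd/system",
)


def find_unit_template(daemon, unit_dirs=UNIT_DIRS):
    """Return the path of '<daemon>@.service' if it exists, else None."""
    template = "{0}@.service".format(daemon)
    for directory in unit_dirs:
        candidate = os.path.join(directory, template)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Running `find_unit_template("ceph-mon")` on the monitor node and getting `None` would be consistent with the failure above; a returned path would point to a different cause.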

Comment 2 David Bajot 2016-03-25 04:29:47 UTC
I encountered the same issue while following the quick deploy steps of the master install doc (http://docs.ceph.com/docs/master/start/quick-ceph-deploy/) to deploy Ceph (hammer) on RHEL 7 nodes:

[cephadmin@ceph1 ceph-lab-cluster]$ sudo -E ceph-deploy --overwrite-conf mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.25): /bin/ceph-deploy --overwrite-conf mon create-initial
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph1
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph1 ...
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph_deploy.mon][INFO  ] distro info: Red Hat Enterprise Linux Server 7.2 Maipo
[ceph1][DEBUG ] determining if provided host has same hostname in remote
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] deploying mon to ceph1
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] remote hostname: ceph1
[ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph1][DEBUG ] create the mon path if it does not exist
[ceph1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph1/done
[ceph1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph1][DEBUG ] create the init path if it does not exist
[ceph1][DEBUG ] locating the `service` executable...
[ceph1][INFO  ] Running command: /usr/sbin/service ceph -c /etc/ceph/ceph.conf start mon.ceph1
[ceph1][WARNIN] The service command supports only basic LSB actions (start, stop, restart, try-restart, reload, force-reload, status). For other actions, please try to use systemctl.
[ceph1][ERROR ] RuntimeError: command returned non-zero exit status: 2
[ceph_deploy.mon][ERROR ] Failed to execute command: /usr/sbin/service ceph -c /etc/ceph/ceph.conf start mon.ceph1
[ceph_deploy][ERROR ] GenericError: Failed to create 1 monitors
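
For triage it helps to pull two facts out of a log like the one above: which ceph-deploy version was invoked and which command failed. The Python sketch below does that under the assumption that the log contains the `Invoked (<version>)` and `Failed to execute command:` lines in the form shown here; `summarize_log` is a hypothetical helper, not a ceph-deploy function.

```python
import re

# Patterns based on the log lines quoted in this report.
INVOKED_RE = re.compile(r"Invoked \((?P<version>[\d.]+)\)")
FAILED_RE = re.compile(r"Failed to execute command: (?P<command>.+)$")


def summarize_log(text):
    """Return (ceph_deploy_version, failed_command); either may be None."""
    version = None
    command = None
    for line in text.splitlines():
        match = INVOKED_RE.search(line)
        if match:
            version = match.group("version")
        match = FAILED_RE.search(line)
        if match:
            command = match.group("command").strip()
    return version, command
```

Applied to the log in this comment, the result pairs version 1.5.25 with the `/usr/sbin/service ceph ... start mon.ceph1` call, that is, an old ceph-deploy using the `service` wrapper on RHEL 7.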

Comment 3 Alfredo Deza 2016-05-06 11:12:42 UTC
Can you try with the latest ceph-deploy (1.5.32 as of this writing)?

This should've been corrected.
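
A quick way to tell whether an installed ceph-deploy predates 1.5.32 is a numeric comparison of the dotted version strings. This is a minimal Python sketch that assumes purely numeric versions such as 1.5.25; `is_older_than` is a hypothetical helper name.

```python
def parse_version(version):
    """Turn '1.5.32' into (1, 5, 32) so parts compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_older_than(installed, minimum="1.5.32"):
    """True if the installed version is strictly older than the minimum."""
    return parse_version(installed) < parse_version(minimum)
```

Comparing tuples of integers avoids the trap of comparing strings, where "1.5.100" would sort before "1.5.32".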

Comment 4 John 2017-11-20 14:18:39 UTC
NEARLY 2018 AND STILL PRESENT USING THE LATEST STABLE RELEASE OF CEPH ON THE LATEST RELEASE OF CENTOS/RHEL.

WHAT MORE INFORMATION DO YOU NEED?

------------------------------------------------------------------
[voltaire][DEBUG ] locating the `service` executable...
[voltaire][INFO  ] Running command: sudo /usr/sbin/service ceph -c /etc/ceph/ceph.conf start mon.voltaire
[voltaire][WARNIN] The service command supports only basic LSB actions (start, stop, restart, try-restart, reload, force-reload, status). For other actions, please try to use systemctl.
[voltaire][ERROR ] RuntimeError: command returned non-zero exit status: 2
[ceph_deploy.mon][ERROR ] Failed to execute command: /usr/sbin/service ceph -c /etc/ceph/ceph.conf start mon.voltaire
[ceph_deploy][ERROR ] GenericError: Failed to create 1 monitors
-------------------------------------------------------------------

[root@voltaire ~]# cat /etc/redhat-release 
CentOS Linux release 7.4.1708 (Core) 
[root@voltaire ~]# rpm -qi ceph-deploy
Name        : ceph-deploy
Version     : 1.5.25
Release     : 1.el7
Architecture: noarch
Install Date: Mon 20 Nov 2017 23:53:38 ACDT
Group       : Unspecified
Size        : 546673
License     : MIT
Signature   : RSA/SHA256, Sat 30 May 2015 02:47:29 ACST, Key ID 6a2faea2352c64e5
Source RPM  : ceph-deploy-1.5.25-1.el7.src.rpm
Build Date  : Wed 27 May 2015 04:45:52 ACST


PLEASE REBUILD AND UPDATE EPEL WITH A NEW CEPH-DEPLOY. THERE'S NO POINT CONTINUING TO DEVELOP A TOOL IF PEOPLE ARE STUCK USING GARBAGE THAT'S NEARLY 3 YEARS OLD.

Comment 5 John 2017-11-20 14:38:17 UTC
OK, here is the problem:

http://mirror.centos.org/centos/7.4.1708/storage/x86_64/ceph-luminous/

The CentOS mirrors for the luminous and jewel releases do not contain ANY version of ceph-deploy, so installing it falls back to the ancient version in EPEL, even if you have the ceph-luminous release repo installed and enabled.

The repo for the hammer release contains ceph-deploy 1.5.31:
http://mirror.centos.org/centos/7.4.1708/storage/x86_64/ceph-hammer/ceph-deploy-1.5.31-1.el7.noarch.rpm
So that is at least an improvement on the ancient 1.5.25 that's in epel.

I thought CentOS was supposed to be tracking RHEL... so, if RHEL has made this omission in its recent releases (jewel and luminous), then that is pitiful.
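
The fallback described above can be illustrated with a deliberately simplified model in Python: among the enabled repositories that carry the package at all, the highest version wins. This is not yum's actual resolver (it ignores priorities, epochs and release tags), and the repository labels are hypothetical; only the version numbers come from this comment.

```python
def parse_version(version):
    """Turn '1.5.31' into (1, 5, 31) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def pick_candidate(repos):
    """Pick (repo, version) with the highest version, or None.

    repos maps a repository label to the ceph-deploy version it
    carries, or to None when the repository does not ship the package.
    """
    available = [
        (parse_version(version), name, version)
        for name, version in repos.items()
        if version is not None
    ]
    if not available:
        return None
    _, name, version = max(available)
    return name, version
```

With `{"epel": "1.5.25", "ceph-luminous": None}` the only candidate is EPEL's 1.5.25, which matches the behaviour reported here; adding a hammer repository carrying 1.5.31 would change the pick.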

Comment 6 John 2017-11-20 14:42:16 UTC
It is a pain in the rear to have to go here, to find the latest ceph-deploy:
http://download.ceph.com/rpm-luminous/el7/noarch/

But at least it is there.

Two things need to happen:

1) remove ceph-deploy from epel. It should not be there.
2) update centos/rhel ceph-luminous & ceph-jewel repositories.

Ta.
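
Until the repositories are updated, one workaround is to point yum at the download.ceph.com location mentioned above. The Python sketch below renders a repo definition for it. The section name, the `name` text and the `gpgkey` URL follow the upstream Ceph installation docs as far as I recall them and should be treated as assumptions, as should the choice of where to save the file.

```python
import configparser
import io


def render_ceph_noarch_repo(
    baseurl="http://download.ceph.com/rpm-luminous/el7/noarch/",
):
    """Return the text of a yum repo definition for Ceph noarch packages."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser["ceph-noarch"] = {
        "name": "Ceph noarch packages",
        "baseurl": baseurl,
        "enabled": "1",
        "gpgcheck": "1",
        # Assumed key location; verify it before trusting packages.
        "gpgkey": "https://download.ceph.com/keys/release.asc",
    }
    buffer = io.StringIO()
    # Write 'key=value' without spaces, the usual yum repo style.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()
```

The returned text could be saved under /etc/yum.repos.d/ (for example as ceph.repo, a name chosen here for illustration) before installing ceph-deploy.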

Comment 7 Ken Dreyer (Red Hat) 2017-11-20 16:28:45 UTC
Hi John,

For #1, would you please file a BZ against the "ceph-deploy" component in the "Fedora EPEL" project? This particular BZ 1292577 is for the downstream RH Ceph Storage product and Fedora's EPEL is an entirely separate thing.

For #2, the ceph-luminous and ceph-jewel repositories in the CentOS Storage SIG are community projects currently run by volunteers. You can reach them on the centos-devel mailing list: https://lists.centos.org/mailman/listinfo/centos-devel

The CentOS Storage SIG Ceph repos do not currently match up with the downstream RH Ceph Storage product. The two are maintained by almost entirely separate groups of people. They are not connected in the way that CentOS itself and RHEL Base are connected, for example, or RDO and RHEL OSP, etc.

Eventually we need to get more alignment there, but that's not the reality today. (If you have a support contract with RH it would be good to pass this feedback up through your RH support representatives.)

Comment 8 John 2017-11-21 11:41:31 UTC
Hi Ken,

Aaah, thanks for that info. That makes a lot of sense. I'll get onto Fedora EPEL and lodge a bug there shortly.

Cheers,
John

Comment 9 Hassan Hashemi 2018-07-18 08:28:05 UTC
I am having the same issue on Ubuntu 17, in 2018.
I noticed that the ceph-deploy package originally installed by following the same docs was old, so I manually downloaded the installer from here:
https://download.ceph.com/debian-luminous/pool/main/c/ceph-deploy/
but I am still getting the same error:

[node1][INFO  ] Running command: sudo systemctl enable ceph-mds@node1
[node1][WARNIN] Failed to enable unit: Unit file ceph-mds does not exist.
[node1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.mds][ERROR ] Failed to execute command: systemctl enable ceph-mds@node1
[ceph_deploy][ERROR ] GenericError: Failed to create 1 MDSs
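
The unit in this log, `ceph-mds@node1`, is an instance of a systemd template, so the file systemd looks for would be `ceph-mds@.service`. The Python sketch below derives the template name from a failed `systemctl enable` line, assuming the unit is written as in the log above (with or without an explicit `.service` suffix). Both function names are hypothetical helpers.

```python
import re

ENABLE_RE = re.compile(r"systemctl enable (?P<unit>\S+)")


def split_instance_unit(unit):
    """Split 'ceph-mds@node1' into ('ceph-mds@.service', 'node1').

    Returns None when the unit is not a template instance.
    """
    prefix, at, instance = unit.partition("@")
    if not at:
        return None
    if instance.endswith(".service"):
        instance = instance[: -len(".service")]
    return prefix + "@.service", instance


def template_from_log_line(line):
    """Extract (template_file, instance) from a 'systemctl enable' line."""
    match = ENABLE_RE.search(line)
    if not match:
        return None
    return split_instance_unit(match.group("unit"))
```

Knowing the template name makes it possible to check whether the package that ships it (here presumably the one providing the MDS daemon) is installed on the node.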