Bug 1356005 - [Ubuntu] calamari-lite is not running on any monitor
Summary: [Ubuntu] calamari-lite is not running on any monitor
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Calamari
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 2.0
Assignee: Christina Meno
QA Contact: Daniel Horák
URL:
Whiteboard:
Depends On:
Blocks: Console-2-DevFreeze
 
Reported: 2016-07-13 09:04 UTC by Daniel Horák
Modified: 2016-08-23 19:44 UTC

Fixed In Version: RHEL: calamari-server-1.4.6-1.el7cp Ubuntu: calamari-server_1.4.6-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-23 19:44:15 UTC
Target Upstream Version:



Links
System ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata RHBA-2016:1755 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 2.0 bug fix and enhancement update | 2016-08-23 23:23:52 UTC

Description Daniel Horák 2016-07-13 09:04:38 UTC
Description of problem:
  On a cluster created from Ubuntu nodes, calamari-lite is not properly started on any monitor.
  
  It might be a problem with supervisor.service, because that service is also not running and is disabled.

Version-Release number of selected component (if applicable):
  USM server (RHEL 7.2):
  ceph-ansible-1.0.5-25.el7scon.noarch
  ceph-installer-1.0.12-4.el7scon.noarch
  rhscon-ceph-0.0.32-1.el7scon.x86_64
  rhscon-core-0.0.33-1.el7scon.x86_64
  rhscon-core-selinux-0.0.33-1.el7scon.noarch
  rhscon-ui-0.0.47-1.el7scon.noarch

  Ceph MON (Ubuntu 16.04):
  calamari-server 1.4.5-2redhat1xenial
  ceph-base       10.2.2-16redhat1xenial
  ceph-common     10.2.2-16redhat1xenial
  ceph-mon        10.2.2-16redhat1xenial
  libcephfs1      10.2.2-16redhat1xenial
  python-cephfs   10.2.2-16redhat1xenial
  rhscon-agent    0.0.14-2redhat1xenial
  
How reproducible:
  100%

Steps to Reproduce:
1. Prepare a set of nodes (one RHEL 7.2 and at least 5 Ubuntu 16.04).
2. Install and configure the USM server on the RHEL node and configure rhscon-agent on the Ubuntu nodes.
3. Create a Ceph cluster via the USM web UI.
4. Check whether calamari-lite is running on any Ceph MON node.
  # supervisorctl status calamari-lite
  # systemctl status supervisor.service 
  
Actual results:
  calamari-lite (and also supervisor.service) is not running, and supervisor.service is not enabled to start after a machine reboot.
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # supervisorctl status calamari-lite
    unix:///var/run/supervisor.sock no such file

  # systemctl status supervisor.service 
    ● supervisor.service - Supervisor process control system for UNIX
       Loaded: loaded (/lib/systemd/system/supervisor.service; disabled; vendor preset: enabled)
       Active: inactive (dead)
         Docs: http://supervisord.org
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
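
  The two symptoms above (unit inactive, unit disabled) can be read straight from the `systemctl status` text. The following is a minimal Python sketch of such a check, not part of Calamari or any Red Hat tooling; the function names `parse_unit_status` and `needs_attention` are hypothetical, and the parser assumes the `Loaded:` and `Active:` line layout shown in the output above.

```python
import re


def parse_unit_status(status_text):
    """Extract load, enablement and activity state from `systemctl status` text.

    Assumes the layout shown in this report, e.g.
      Loaded: loaded (/lib/systemd/system/supervisor.service; disabled; ...)
      Active: inactive (dead)
    """
    loaded = re.search(
        r"^\s*Loaded:\s*(\S+)\s*\(([^;)]*);\s*([^;)]*)", status_text, re.M
    )
    active = re.search(r"^\s*Active:\s*(\S+)\s*\(([^)]*)\)", status_text, re.M)
    return {
        "load_state": loaded.group(1) if loaded else None,
        "unit_file": loaded.group(2).strip() if loaded else None,
        "enablement": loaded.group(3).strip() if loaded else None,
        "active_state": active.group(1) if active else None,
        "sub_state": active.group(2) if active else None,
    }


def needs_attention(status):
    """Return True if the unit is not running or will not start at boot."""
    return status["active_state"] != "active" or status["enablement"] != "enabled"


if __name__ == "__main__":
    sample = (
        "● supervisor.service - Supervisor process control system for UNIX\n"
        "   Loaded: loaded (/lib/systemd/system/supervisor.service; disabled; "
        "vendor preset: enabled)\n"
        "   Active: inactive (dead)\n"
        "     Docs: http://supervisord.org\n"
    )
    print(needs_attention(parse_unit_status(sample)))
```

  In practice the text would come from running `systemctl status supervisor.service` on the MON node; the sample here is copied from the output in this report.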

Expected results:
  calamari-lite is properly configured and running on one Ceph MON, as required, and is also configured to start automatically after a machine reboot.

Additional info:
  I'm not 100% sure who is responsible for configuring and starting calamari and related services, so if the problem is, for example, in ceph-installer or ceph-ansible, please reassign this bug to the proper component.

  It might be related to Bug 1305259.

Comment 1 Daniel Horák 2016-07-13 09:15:26 UTC
Just a note: I also noticed that the supervisor service is named differently on RHEL and on Ubuntu.
On RHEL it is supervisord.service, but on Ubuntu it is supervisor.service.

Comment 2 Nishanth Thomas 2016-07-13 13:31:15 UTC
Yes Daniel, that is the root cause of this issue.

https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/calamari_ctl.py#L260 tries to start supervisord (as specified in /opt/calamari/salt-local/services.sls), whereas on Ubuntu the service to start is supervisor.
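
To illustrate the naming mismatch, here is a minimal Python sketch of choosing the service name per distribution. It is not the change that went into Calamari 1.4.6. The helper names `parse_os_release` and `supervisor_service_name` are hypothetical, detection via the `ID` and `ID_LIKE` fields of `/etc/os-release` is an assumption, and so is treating Debian the same as Ubuntu (this report only confirms the RHEL and Ubuntu names).

```python
def parse_os_release(text):
    """Parse /etc/os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def supervisor_service_name(os_release_text):
    """Return the supervisor service name for the given os-release content.

    Per this report: Ubuntu uses "supervisor", RHEL uses "supervisord".
    Mapping Debian to "supervisor" as well is an assumption.
    """
    fields = parse_os_release(os_release_text)
    ids = {fields.get("ID", "")} | set(fields.get("ID_LIKE", "").split())
    if ids & {"ubuntu", "debian"}:
        return "supervisor"
    return "supervisord"


if __name__ == "__main__":
    with open("/etc/os-release") as handle:
        print(supervisor_service_name(handle.read()))
```

Keeping the lookup in a pure function that takes the file content as a string makes it easy to check both distributions without access to both kinds of host.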

Comment 5 Christina Meno 2016-07-14 20:44:33 UTC
https://github.com/ceph/calamari/releases/tag/v1.4.6

Comment 8 Daniel Horák 2016-07-28 07:44:00 UTC
Tested on:
USM Server (RHEL 7.2):
  ceph-ansible-1.0.5-31.el7scon.noarch
  ceph-installer-1.0.14-1.el7scon.noarch
  rhscon-ceph-0.0.36-1.el7scon.x86_64
  rhscon-core-0.0.36-1.el7scon.x86_64
  rhscon-core-selinux-0.0.36-1.el7scon.noarch
  rhscon-ui-0.0.50-1.el7scon.noarch

Ceph MON (Ubuntu 16.04):
  ii  calamari-server 1.4.7-2redhat1xenial    amd64  Inktank package containing the Calamari management server
  ii  ceph-base       10.2.2-23redhat1xenial  amd64  common ceph daemon libraries and management tools
  ii  ceph-common     10.2.2-23redhat1xenial  amd64  common utilities to mount and interact with a ceph storage cluster
  ii  ceph-mon        10.2.2-23redhat1xenial  amd64  monitor server for the ceph storage system
  ii  libcephfs1      10.2.2-23redhat1xenial  amd64  Ceph distributed file system client library
  ii  python-cephfs   10.2.2-23redhat1xenial  amd64  Python libraries for the Ceph libcephfs library
  ii  rhscon-agent    0.0.16-2redhat1xenial   all    SKYNET is the event agent for SKYRING. Each storage node managed

The supervisor service and calamari-lite are properly running on one Ceph MON.

>> VERIFIED

Comment 10 errata-xmlrpc 2016-08-23 19:44:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1755.html

