Bug 1336571

Summary: ceph-mds not installed correctly by ceph-ansible
Product: [Red Hat Storage] Red Hat Storage Console Reporter: Ben England <bengland>
Component: ceph-ansible Assignee: Andrew Schoen <aschoen>
Status: CLOSED ERRATA QA Contact: sds-qe-bugs
Severity: medium Docs Contact:
Priority: high    
Version: 2CC: adeza, aschoen, bmarson, ceph-eng-bugs, ceph-qe-bugs, gfarnum, hnallurv, kdreyer, kurs, nthomas, sankarshan, seb, shan, vakulkar
Target Milestone: ---   
Target Release: 2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-ansible-1.0.5-13.el7scon Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-23 19:50:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ben England 2016-05-16 22:41:03 UTC
Description of problem:

The ceph-mds server is not brought up correctly by ceph-ansible, and the fix is trivial.

Version-Release number of selected component (if applicable):

RHEL 7.2 GA

ceph-ansible:
  http://puddle.ceph.redhat.com/puddles/rhscon/2/2016-05-12.1/RHSCON-2/x86_64/os/Packages/ceph-ansible-1.0.5-10.el7scon.noarch.rpm

RHCS-2 puddle:
  http://puddle.ceph.redhat.com/puddles/ceph/2/2016-05-06.1/CEPH-2.repo

How reproducible:

every time

Steps to Reproduce:
1.  install RHSCON-2 from the RHCS 2.0 puddle as described in the doc below 

https://docs.google.com/document/d/1GzcpiciMLdNzZ46BLjVivKBVIhb1VDvzwZXAqX4Or9s

2.  configure [mdss] in ceph-ansible's inventory file

3.  run ceph-ansible
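For reference, a minimal sketch of the inventory used in step 2; the host name is taken from the log output below, and the grouping is illustrative rather than the exact file used:

```ini
# ceph-ansible inventory sketch (host name from the failure log; layout illustrative)
[mons]
ceph-r730-01-10ge

[mdss]
ceph-r730-01-10ge
```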


Actual results:

The first sign of trouble is:

TASK: [ceph-mds | enable systemd unit file for mds instance (for or after infernalis)] ***
ok: [ceph-r730-01-10ge] => {"changed": false, "failed": false, "failed_when_result": false, "msg": "src file does not exist, use \"force=yes\" if you really want to create the link: /usr/lib/systemd/system/ceph-mds@.service", "path": "/etc/systemd/system/multi-user.target.wants/ceph-mds", "src": "/usr/lib/systemd/system/ceph-mds@.service", "state": "absent"}

but the hard failure is at:

TASK: [ceph-mds | start and add that the metadata service to the init sequence (systemd after hammer)] ***
failed: [ceph-r730-01-10ge] => {"changed": false, "failed": true}
msg: Error when trying to enable ceph-mds@ceph-r730-01-10ge: rc=1 Failed to execute operation: No such file or directory

Expected results:

ceph-mds should be installed, set up to start on boot, and started

Additional info:

The root causes are:

- the ceph-mds RPM was never installed
- ceph-ansible is doing something it doesn't have to do by hand-linking /etc/systemd/system/multi-user.target.wants/ceph-mds@{{ mds_name }}.service to /usr/lib/systemd/system/ceph-mds@.service ("systemctl enable" creates that symlink itself, so the manual link task is redundant, and it fails when the RPM, and hence the unit file, is absent)

Instead, the task "start and add that the metadata service to the init sequence (systemd after hammer)" should take care of everything; it does the equivalent of:

systemctl enable ceph-mds@<mds_name>
systemctl start ceph-mds@<mds_name>
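
In Ansible terms, those two commands collapse into a single service task. A minimal sketch, assuming the mds_name variable used elsewhere in the ceph-mds role (the task name is illustrative, not the exact downstream task):

```yaml
# Sketch: enable and start the templated MDS systemd unit in one task.
# mds_name is assumed to be defined as in the ceph-mds role.
- name: start and enable the metadata server (systemd)
  service:
    name: "ceph-mds@{{ mds_name }}"
    state: started
    enabled: yes
```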

The following patch is my attempt to fix this; it needs other eyes on it.

diff -ur /usr/share/ceph-ansible/roles/ceph-common/tasks/installs/install_on_redhat.yml ceph-ansible/roles/ceph-common/tasks/installs/install_on_redhat.yml
--- /usr/share/ceph-ansible/roles/ceph-common/tasks/installs/install_on_redhat.yml	2016-05-10 17:43:34.000000000 -0400
+++ ceph-ansible/roles/ceph-common/tasks/installs/install_on_redhat.yml	2016-05-16 17:57:16.109154142 -0400
@@ -134,3 +134,20 @@
   when:
     rgw_group_name in group_names and
     ansible_pkg_mgr == "dnf"
+
+- name: install mds for yum
+  yum:
+    name: ceph-mds
+    state: "{{ (upgrade_ceph_packages|bool) | ternary('latest','present') }}"
+  when:
+    mds_group_name in group_names and
+    ansible_pkg_mgr == "yum"
+
+- name: install mds for DNF
+  dnf:
+    name: ceph-mds
+    state: "{{ (upgrade_ceph_packages|bool) | ternary('latest','present') }}"
+  when:
+    mds_group_name in group_names and
+    ansible_pkg_mgr == "dnf"
+
diff -ur /usr/share/ceph-ansible/roles/ceph-mds/tasks/pre_requisite.yml ceph-ansible/roles/ceph-mds/tasks/pre_requisite.yml
--- /usr/share/ceph-ansible/roles/ceph-mds/tasks/pre_requisite.yml	2016-05-10 17:43:35.000000000 -0400
+++ ceph-ansible/roles/ceph-mds/tasks/pre_requisite.yml	2016-05-16 18:01:32.943576869 -0400
@@ -70,17 +70,6 @@
   changed_when: false
   when: not use_systemd
 
-- name: enable systemd unit file for mds instance (for or after infernalis)
-  file:
-    src: /usr/lib/systemd/system/ceph-mds@.service
-    dest: /etc/systemd/system/multi-user.target.wants/ceph-mds@{{ mds_name }}.service
-    state: link
-  changed_when: false
-  failed_when: false
-  when:
-    use_systemd and
-    is_after_hammer
-
 - name: start and add that the metadata service to the init sequence (upstart)
   command: initctl emit ceph-mds cluster={{ cluster }} id={{ mds_name }}
   changed_when: false

Comment 2 seb 2016-05-17 11:32:58 UTC
I think this is fixed upstream now.
Can you confirm?

Comment 3 Ken Dreyer (Red Hat) 2016-05-17 15:41:59 UTC
Andrew do we have all the patches we need downstream for this?

Comment 11 Ken Dreyer (Red Hat) 2016-05-18 20:31:29 UTC
*** Bug 1335314 has been marked as a duplicate of this bug. ***

Comment 12 Vasu Kulkarni 2016-05-18 20:35:59 UTC
Verified this on the latest RHSCON build; the issue no longer occurs.

ceph-ansible 1.0.5-13.el7scon

Comment 13 Harish NV Rao 2016-05-30 11:19:47 UTC
Moving to the VERIFIED state based on comment 12

Comment 15 errata-xmlrpc 2016-08-23 19:50:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1754