Bug 1696717 - [RFE] deploy manila cephfs-with-NFS with an external ceph cluster
Summary: [RFE] deploy manila cephfs-with-NFS with an external ceph cluster
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z2
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Giulio Fidente
QA Contact: Yogev Rabl
Docs Contact: Laura Marsh
URL:
Whiteboard:
Depends On: 1710358 1801319 1802066 1814942 1819988 1822328 1831285 1831342
Blocks: 1766484 1843668
Reported: 2019-04-05 13:35 UTC by Tom Barron
Modified: 2023-12-15 16:25 UTC
CC List: 26 users

Fixed In Version: openstack-tripleo-heat-templates-11.3.1-0.20191126041653.414d4d9.el8ost
Doc Type: Enhancement
Doc Text:
This feature enables the Red Hat OpenStack Platform director to deploy the Shared File System (manila) with an external Ceph Storage cluster. In this type of deployment, Ganesha still runs on the Controller nodes that Pacemaker manages using an active-passive configuration. This feature is supported with Ceph Storage 4.1 or later.
Clone Of:
Environment:
Last Closed: 2020-04-15 10:38:44 UTC
Target Upstream Version:
Embargoed:


Attachments
failed deployment logs (1.43 MB, application/gzip)
2020-02-10 15:19 UTC, Yogev Rabl


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 662221 0 'None' MERGED Allow for configuration of the Manila CephFS backend with a remote Ceph 2020-12-01 01:38:13 UTC

Description Tom Barron 2019-04-05 13:35:18 UTC
Description of problem: Support for deploying manila with CephFS-with-NFS (via a ganesha gateway) was added in OSP13, but only for deployments where Director installs both the Ceph daemons and the Ganesha daemon. While this meets the needs of some of our customers, others would like to deploy manila so that it references an externally deployed Ceph cluster.

There are several possible ways that this need might be met. For example, there is ongoing work to deploy the ceph daemons and ganesha via rook and kubernetes, but we don't have a concrete timeline for that work and it would not be something we could backport. Alternatively, we may be able to modify the current TripleO heat templates and ceph-ansible playbooks so that, if an external cluster is available, we reference it instead of installing the daemons ourselves. There are two variations of this last approach: one where only the ceph daemons are external and we still deploy ganesha, and one where ganesha is also external.

An important consideration for all these possibilities is that ganesha is in the data path for the share service and cannot today run active-active. That is why, when we introduced support for CephFS-with-NFS in OSP13, we ran ganesha on controller nodes as part of the pacemaker cluster there. That need also drove the choice to lead with support only for Director-integrated deployment of the ganesha and ceph daemons.

So this work may split into three phases:

  1) see if we can keep pacemaker control of ganesha for service availability
     but allow the ceph daemons themselves to be externally deployed.

  2) for deployments that can manage ganesha availability themselves,
     allow ceph daemons and ganesha to be externally deployed.  This scenario
     would likely always involve a Support Exception so that Red Hat is
     not held accountable for failure in the data path.

  3) longer term, work with Storage BU on the rook-based deployment of
     external ceph daemons and ganesha service. Even if Director
     triggers the deployment of the ceph-ganesha infrastructure at the
     same time that it deploys the overcloud, that infrastructure is
     technically external to OpenStack itself, and the HA for the ganesha
     service is no longer maintained by the pacemaker cluster in OpenStack.
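
As a reading aid only, the three phases above can be compared side by side with the OSP13 baseline. The sketch below is a minimal Python summary of who deploys each component and who keeps ganesha available in each case. The class name, field names, and string labels are hypothetical and chosen for illustration; they do not correspond to any TripleO or ceph-ansible parameter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeploymentModel:
    """One deployment variant discussed in this bug (labels are illustrative)."""
    name: str
    ceph_daemons: str                  # who deploys the Ceph daemons
    ganesha: str                       # who deploys the Ganesha gateway
    ganesha_ha: str                    # who keeps Ganesha available
    support_exception: Optional[bool]  # None = not stated in the description


MODELS = [
    # OSP13 baseline: Director installs everything, pacemaker provides HA.
    DeploymentModel("OSP13 baseline", "director", "director", "pacemaker", False),
    # Phase 1: only the Ceph daemons are external.
    DeploymentModel("phase 1", "external", "director", "pacemaker", False),
    # Phase 2: both external; described as likely needing a Support Exception.
    DeploymentModel("phase 2", "external", "external", "deployment operator", True),
    # Phase 3: rook-based; HA is no longer maintained by pacemaker in OpenStack.
    DeploymentModel("phase 3", "rook", "rook", "outside OpenStack", None),
]


def pacemaker_managed(models):
    """Return the names of variants where pacemaker still controls Ganesha."""
    return [m.name for m in models if m.ganesha_ha == "pacemaker"]


if __name__ == "__main__":
    print(pacemaker_managed(MODELS))
```

Only the baseline and phase 1 keep ganesha under pacemaker control, which is why phase 1 is the smallest step away from the existing OSP13 support.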

Comment 30 Yogev Rabl 2020-02-10 15:18:18 UTC
Deployment failed in version openstack-tripleo-heat-templates-11.3.2-0.20200131125640.cc909b6.el8ost.noarch with the error:

  "fatal: [controller-0]: FAILED! => ",
        "  msg: |-",
        "    The task includes an option with an undefined variable. The error was: 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'",
        "  ",
        "    The error appears to be in '/usr/share/ceph-ansible/roles/ceph-nfs/tasks/main.yml': line 29, column 3, but may",
        "    be elsewhere in the file depending on the exact syntax problem.",
        "    The offending line appears to be:",
        "    - name: copy rgw keyring when deploying internal ganesha with external ceph cluster",
        "      ^ here",

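The log above does not state the root cause. What the message itself says is that `.name` was evaluated on a plain string (`AnsibleUnicode` is Ansible's string type) where an object or mapping with a `name` entry was expected. The following is a minimal Python sketch of that general failure mode, using hypothetical data; it is not the ceph-ansible task and does not claim to reproduce the actual variable involved.

```python
from types import SimpleNamespace


def collect_names(items):
    """Mimic a loop that evaluates `item.name` for every entry."""
    return [item.name for item in items]


# Hypothetical inputs: entries with a `name` attribute vs. bare strings.
well_formed = [SimpleNamespace(name="client.example")]
malformed = ["client.example"]


def try_collect(items):
    """Return the collected names, or a short description of the failure."""
    try:
        return collect_names(items)
    except AttributeError as exc:
        return f"failed: {exc}"


if __name__ == "__main__":
    print(try_collect(well_formed))
    print(try_collect(malformed))
```

When chasing this kind of error, it usually helps to check the shape of the variable the task loops over (list of strings versus list of mappings) before looking at the task syntax itself.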
Comment 31 Yogev Rabl 2020-02-10 15:19:02 UTC
Created attachment 1662167
failed deployment logs

Comment 33 Lon Hohberger 2020-02-13 11:40:17 UTC
According to our records, this should be resolved by openstack-tripleo-heat-templates-11.3.2-0.20200131125640.cc909b6.el8ost.  This build is available now.

Comment 44 Yogev Rabl 2020-04-15 00:59:09 UTC
Verified: all manila tests passed.

