Bug 1797075 - [OSP 16/RHCS 4.0] Can't create or revoke access to Manila shares - ceph-nfs-pacemaker is broken
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Container
Version: 4.0
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: rc
Target Release: 4.1
Assignee: Dimitri Savineau
QA Contact: Yogev Rabl
Docs Contact: Karen Norteman
URL:
Whiteboard:
Depends On:
Blocks: 1760354 1797047 1798514 1799098 1816167
Reported: 2020-01-31 20:41 UTC by Tom Barron
Modified: 2023-10-06 19:07 UTC (History)

Fixed In Version: rhceph:ceph-4.0-rhel-8-containers-candidate-64223-20200206175240
Doc Type: Bug Fix
Doc Text:
.The `nfs-ganesha` daemon starts normally
Previously, a configuration using `nfs-ganesha` with the RADOS backend would not start because the `nfs-ganesha-rados-urls` library was missing. This occurred because the `nfs-ganesha` library package for the RADOS backend was moved to a dedicated package. With this update, the `nfs-ganesha-rados-urls` package is added to the Ceph container image, so the `nfs-ganesha` daemon starts successfully.
Clone Of: 1797047
Environment:
Last Closed: 2020-06-03 16:22:16 UTC
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-container pull 1580 0 None closed ubi8: add nfs-ganesha-rados-urls package 2021-01-19 21:12:05 UTC
Github ceph ceph-container pull 1581 0 None closed ubi8: add nfs-ganesha-rados-urls package (bp #1580) 2021-01-19 21:12:05 UTC
Red Hat Issue Tracker RHCEPH-3585 0 None None None 2022-02-28 16:37:48 UTC
Red Hat Product Errata RHBA-2020:2385 0 None None None 2020-06-03 16:22:32 UTC

Description Tom Barron 2020-01-31 20:41:07 UTC
+++ This bug was initially created as a clone of Bug #1797047 +++

Description of problem:

## description from original bug in OSP Manila follows, but I repeat here the ganesha log from the ceph-nfs container for clarity:

<log>
   Jan 31 19:34:06 controller-0 podman[272774]: exec: PID 57: spawning /usr/bin/ganesha.nfsd  -F -L STDOUT
   Jan 31 19:34:06 controller-0 podman[272774]: exec: Waiting 57 to quit
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version /builddir/build/BUILD/nfs-ganesha-2.8.3/src, built at Jan 17 2020 20:56:03 on
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] load_rados_config :CONFIG :CRIT :Unknown urls backend
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :NFS STARTUP :CRIT :Error (token scan) while parsing (/etc/ganesha/ganesha.conf)
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] config_errs_to_log :CONFIG :CRIT :Config File (/etc/ganesha/ganesha.conf:24): new url (rados://manila_data/ganesha-export-index) open error (Success), ignored
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :NFS STARTUP :FATAL :Fatal errors.  Server exiting...
</log>

When we have seen this in the past, the ganesha build packaged into the ceph image had been compiled without the cmake flags it needs to understand the rados:// URL in ganesha.conf.
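
The failure signature in the log above can be picked out mechanically. What follows is a minimal Python sketch, not something from the original report: the regular expression is an assumption derived only from the ganesha lines quoted above, and the helper names `find_fatal_entries` and `missing_rados_urls_backend` are hypothetical.

```python
import re

# Matches the tail of a ganesha log line as it appears in the excerpt, e.g.
#   "... ganesha.nfsd-57[main] load_rados_config :CONFIG :CRIT :Unknown urls backend"
# Assumption: every entry has the shape "<func> :<COMPONENT> :<LEVEL> :<message>".
ENTRY = re.compile(
    r"ganesha\.nfsd-\d+\[\w+\]\s+(?P<func>\S+)\s+"
    r":(?P<component>[^:]+?)\s*:(?P<level>[A-Z]+)\s*:(?P<msg>.*)$"
)


def find_fatal_entries(lines):
    """Return (function, level, message) for every CRIT or FATAL entry."""
    hits = []
    for line in lines:
        match = ENTRY.search(line)
        if match and match.group("level") in {"CRIT", "FATAL"}:
            hits.append(
                (match.group("func"), match.group("level"), match.group("msg").strip())
            )
    return hits


def missing_rados_urls_backend(lines):
    """True when the log contains the 'Unknown urls backend' startup failure."""
    return any(msg == "Unknown urls backend" for _, _, msg in find_fatal_entries(lines))
```

Feeding the journal lines for the ceph-nfs unit into `missing_rados_urls_backend` gives a quick yes/no on whether a restart loop matches this bug rather than some other startup error.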


Original bug description:

I'm unable to allow/deny access or mount shares with an OSP 16 (GA candidate) puddle [1] and RHCS 4.0 [2]. When I query nfs server status from an OSP controller node the nfs-ganesha server doesn't seem to respond:

   [root@controller-0 manila]# rpcinfo -T tcp 172.17.5.126 100003
   rpcinfo: RPC: Program not registered
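
For repeated checks, the rpcinfo probe above can be wrapped in a small script. This is only a sketch under stated assumptions: it relies on the `rpcinfo` binary being on PATH, it treats the "Program not registered" text shown above as the only failure marker it recognizes, and the function names are hypothetical. 100003 is the RPC program number for NFS.

```python
import subprocess

NFS_PROGRAM = "100003"  # RPC program number for NFS, as used in the command above


def nfs_program_registered(rpcinfo_output: str) -> bool:
    """Interpret text printed by `rpcinfo -T tcp <host> 100003`.

    Only the failure message seen in this report is recognized; any other
    output is treated as "registered", so callers should also check the
    exit status (probe_nfs does).
    """
    return "Program not registered" not in rpcinfo_output


def probe_nfs(host: str, timeout: float = 10.0) -> bool:
    """Run rpcinfo against `host` and report whether NFS answers over TCP."""
    result = subprocess.run(
        ["rpcinfo", "-T", "tcp", host, NFS_PROGRAM],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    combined = result.stdout + result.stderr
    return result.returncode == 0 and nfs_program_registered(combined)
```

Calling `probe_nfs("172.17.5.126")` from a controller node would repeat the check shown above.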


Version-Release number of selected component (if applicable): 16.0

[1] RHOS_TRUNK-16.0-RHEL-8-20200130.n.0
[2] Ceph Image details:
    "Labels": {
                "CEPH_POINT_RELEASE": "",
                "GIT_BRANCH": "stable-4.0",
                "GIT_CLEAN": "True",
                "GIT_COMMIT": "376b3b9a129c6fe1a1d081711fa662ccfd657452",
                "GIT_REPO": "https://github.com/ceph/ceph-container.git",
                "RELEASE": "stable-4.0",
                "architecture": "x86_64",
                "authoritative-source-url": "registry.access.redhat.com",
                "build-date": "2020-01-20T23:01:19.600649",
                "com.redhat.build-host": "cpt-1007.osbs.prod.upshift.rdu2.redhat.com",
                "com.redhat.component": "rhceph-container",
                "com.redhat.license_terms": "https://www.redhat.com/en/about/red-hat-end-user-license-agreements",
                "description": "Red Hat Ceph Storage 4",
                "distribution-scope": "public",
                "io.k8s.description": "Red Hat Ceph Storage 4",
                "io.k8s.display-name": "Red Hat Ceph Storage 4 on RHEL 8",
                "io.openshift.expose-services": "",
                "io.openshift.tags": "rhceph ceph",
                "maintainer": "Dimitri Savineau <dsavinea>",
                "name": "rhceph",
                "release": "121.20200120.ci.1",
                "summary": "Provides the latest Red Hat Ceph Storage 4 on RHEL 8 in a fully featured and supported base image.",
                "url": "https://access.redhat.com/containers/#/registry.access.redhat.com/rhceph/images/4-121.20200120.ci.1",
                "vcs-ref": "76bcc9029f35fc0bef5e4ab813a23fe95d3ad2e1",
                "vcs-type": "git",
                "vendor": "Red Hat, Inc.",
                "version": "4"
            },

How reproducible: Always


Steps to Reproduce:
1. Deploy RHOSP 16 beta (the GA candidate) with manila and ceph via the nfs-ganesha backend. The ceph image should be RHCS 4.0 (beta versions are available on access.redhat.com/containers).
2. After the deployment finishes, create a manila share and allow access to it.


Actual results:

   Access rule transitions to "error" state

Expected results:

   Access rule transitions to "active" state

Additional info:

See the triage information in the comments below.

--- Additional comment from Goutham Pacha Ravi on 2020-01-31 19:42:21 UTC ---

Triage:

  0) Check manila share service logs on the controller hosting the manila-share pacemaker bundle:

    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server [req-260337f6-29ab-4300-8c53-7be6ef02be0a da6f2e04378d4274bf5a5ca166d0e99a 99bffc13f98a43ffa30f12c3a081853d - - -] Exception during message handling: manila.exception.GaneshaCommandFailure: Ganesha management command failed.
Command: dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:/etc/ganesha/export.d/share-5036e505-be7b-42c0-81f7-12844726781e.conf.pbnc57 string:EXPORT(Export_Id=1001)
Exit code: 1
Stdout: ''
Stderr: 'Error org.freedesktop.DBus.Error.ServiceUnknown: The name org.ganesha.nfsd was not provided by any .service files\n'
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/manager.py", line 233, in _execute
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return execute(*args, **kwargs)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/utils.py", line 59, in __call__
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return self.execute(*args, **exkwargs)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/utils.py", line 101, in execute
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return processutils.execute(*cmd, **kwargs)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 424, in execute
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     cmd=sanitized_cmd)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Command: dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:/etc/ganesha/export.d/share-5036e505-be7b-42c0-81f7-12844726781e.conf.pbnc57 string:EXPORT(Export_Id=1001)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Exit code: 1
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stdout: ''
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stderr: 'Error org.freedesktop.DBus.Error.ServiceUnknown: The name org.ganesha.nfsd was not provided by any .service files\n'
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/manager.py", line 474, in add_export
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     "string:EXPORT(Export_Id=%d)" % xid)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     message='dbus call %s.%s' % (service, method), **kwargs)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/manager.py", line 242, in _execute
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     cmd=e.cmd)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server manila.exception.GaneshaCommandFailure: Ganesha management command failed.
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Command: dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:/etc/ganesha/export.d/share-5036e505-be7b-42c0-81f7-12844726781e.conf.pbnc57 string:EXPORT(Export_Id=1001)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Exit code: 1
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stdout: ''
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stderr: 'Error org.freedesktop.DBus.Error.ServiceUnknown: The name org.ganesha.nfsd was not provided by any .service files\n'
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/manager.py", line 187, in wrapped
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return f(self, *args, **kwargs)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/utils.py", line 568, in wrapper
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return func(self, *args, **kwargs)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/manager.py", line 3554, in update_access
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/access.py", line 283, in update_access_rules
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/access.py", line 322, in _update_access_rules
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/access.py", line 390, in _update_rules_through_share_driver
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/access.py", line 390, in _update_rules_through_share_driver
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/cephfs/driver.py", line 289, in update_access
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/__init__.py", line 308, in update_access
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     self.ganesha.add_export(share['name'], confdict)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/manager.py", line 491, in add_export
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     cmd=e.cmd)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server manila.exception.GaneshaCommandFailure: Ganesha management command failed.
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Command: dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:/etc/ganesha/export.d/share-5036e505-be7b-42c0-81f7-12844726781e.conf.pbnc57 string:EXPORT(Export_Id=1001)
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Exit code: 1
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stdout: ''
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stderr: 'Error org.freedesktop.DBus.Error.ServiceUnknown: The name org.ganesha.nfsd was not provided by any .service files\n'
    2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server

  1) Log into one of the controller nodes and check the Podman containers for ceph.  "ceph-nfs-pacemaker" container is likely missing:

     podman ps | grep ceph
     b279eaad8f5d  undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest  9 days ago  Up 9 days ago  ceph-mds-controller-0
     289e591ba4ec  undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest  9 days ago  Up 9 days ago  ceph-mgr-controller-0
     6400ff546651  undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest  9 days ago  Up 9 days ago  ceph-mon-controller-0

    
  You can watch "podman ps" to see "ceph-nfs-pacemaker" container being restarted.

  2) journalctl -u ceph-nfs@pacemaker

   Jan 31 19:34:06 controller-0 podman[272774]: exec: PID 57: spawning /usr/bin/ganesha.nfsd  -F -L STDOUT
   Jan 31 19:34:06 controller-0 podman[272774]: exec: Waiting 57 to quit
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version /builddir/build/BUILD/nfs-ganesha-2.8.3/src, built at Jan 17 2020 20:56:03 on
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] load_rados_config :CONFIG :CRIT :Unknown urls backend
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :NFS STARTUP :CRIT :Error (token scan) while parsing (/etc/ganesha/ganesha.conf)
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] config_errs_to_log :CONFIG :CRIT :Config File (/etc/ganesha/ganesha.conf:24): new url (rados://manila_data/ganesha-export-index) open error (Success), ignored
   Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :NFS STARTUP :FATAL :Fatal errors.  Server exiting...
   Jan 31 19:34:06 controller-0 podman[272774]: teardown: managing teardown after SIGCHLD
   Jan 31 19:34:06 controller-0 podman[272774]: teardown: Waiting PID 57 to terminate
   Jan 31 19:34:06 controller-0 podman[272774]: teardown: Process 57 is terminated
   Jan 31 19:34:06 controller-0 podman[272774]: teardown: Bye Bye, container will die    with return code 0
   Jan 31 19:34:06 controller-0 podman[272774]: 2020-01-31 19:34:06.397920413 +0000 UTC m=+5.484116389 container died 6654dd25c3c73da64ea3bcc9fdf52e9900563db4791bda98b74e7d7af22dff8b (image=undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest, name=ceph-nfs-pacemaker)
   Jan 31 19:34:06 controller-0 podman[272774]: 2020-01-31 19:34:06.456853837 +0000 UTC m=+5.543049781 container remove 6654dd25c3c73da64ea3bcc9fdf52e9900563db4791bda98b74e7d7af22dff8b (image=undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest, name=ceph-nfs-pacemaker)
   Jan 31 19:34:06 controller-0 podman[273303]: Error: no container with name or ID ceph-nfs-pacemaker found: no such container
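
The triage steps above can be partly automated. The sketch below is an illustration rather than a supported tool: it assumes `podman` is on PATH, it uses `podman ps --format '{{.Names}}'` to list one container name per line, and all function names are hypothetical. The container name and the D-Bus error text are taken from the logs above.

```python
import subprocess

EXPECTED_CONTAINER = "ceph-nfs-pacemaker"  # name seen in the journal excerpt


def running_container_names(names_output: str) -> set:
    """Parse `podman ps --format '{{.Names}}'` output (one name per line)."""
    return {line.strip() for line in names_output.splitlines() if line.strip()}


def ganesha_container_running(names_output: str,
                              expected: str = EXPECTED_CONTAINER) -> bool:
    """True when the ganesha container appears among the running containers."""
    return expected in running_container_names(names_output)


def ganesha_absent_from_dbus(stderr: str) -> bool:
    """True when dbus-send reports that nothing owns org.ganesha.nfsd."""
    return ("org.freedesktop.DBus.Error.ServiceUnknown" in stderr
            and "org.ganesha.nfsd" in stderr)


def list_names() -> str:
    """Collect the names of running containers from podman."""
    completed = subprocess.run(
        ["podman", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

Because the container is restarted continuously, a single `list_names()` call may occasionally catch it while it is briefly up. Sampling a few times, or combining the result with the D-Bus check, gives a steadier answer.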

Comment 1 Dimitri Savineau 2020-01-31 21:54:38 UTC
Since the nfs-ganesha 2.8.3 rebase in RHCS 4, the nfs-ganesha package has been split into several packages. One of the new packages, nfs-ganesha-rados-urls, contains the libganesha_rados_urls library that handles RADOS URL configurations.
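
One way to confirm whether a given image carries the split-out package is to query the RPM database inside it. This is a hedged sketch, not the procedure used for the fix: it assumes `podman` is available and that the image ships `rpm`, it overrides the image entrypoint to run `rpm -q`, and the function names are hypothetical.

```python
import subprocess

PACKAGE = "nfs-ganesha-rados-urls"  # package named in the comment above


def package_installed(rpm_query_output: str, package: str = PACKAGE) -> bool:
    """Interpret `rpm -q <package>` output.

    rpm prints "package <name> is not installed" for an absent package and
    "<name>-<version>-<release>.<arch>" for an installed one.
    """
    text = rpm_query_output.strip()
    return text.startswith(package + "-") and "is not installed" not in text


def query_in_container(image: str, package: str = PACKAGE) -> str:
    """Run `rpm -q <package>` in a throwaway container created from `image`."""
    result = subprocess.run(
        ["podman", "run", "--rm", "--entrypoint", "rpm", image, "-q", package],
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the output of `query_in_container(...)` for the deployed image to `package_installed` should return False for images built before the fix and True for images at or after the "Fixed In Version" build.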

Comment 26 David Hill 2020-04-02 15:03:47 UTC
May we have the hotfix for my customer, then?

Comment 43 errata-xmlrpc 2020-06-03 16:22:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2385

Comment 44 Red Hat Bugzilla 2023-09-15 00:21:07 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

