Description of problem:
I'm unable to allow/deny access or mount shares with an OSP 16 (GA candidate) puddle [1] and RHCS 4.0 [2]. When I query nfs server status from an OSP controller node the nfs-ganesha server doesn't seem to respond:

[root@controller-0 manila]# rpcinfo -T tcp 172.17.5.126 100003
rpcinfo: RPC: Program not registered

Version-Release number of selected component (if applicable):
16.0

[1] RHOS_TRUNK-16.0-RHEL-8-20200130.n.0
[2] Ceph Image details:
"Labels": {
    "CEPH_POINT_RELEASE": "",
    "GIT_BRANCH": "stable-4.0",
    "GIT_CLEAN": "True",
    "GIT_COMMIT": "376b3b9a129c6fe1a1d081711fa662ccfd657452",
    "GIT_REPO": "https://github.com/ceph/ceph-container.git",
    "RELEASE": "stable-4.0",
    "architecture": "x86_64",
    "authoritative-source-url": "registry.access.redhat.com",
    "build-date": "2020-01-20T23:01:19.600649",
    "com.redhat.build-host": "cpt-1007.osbs.prod.upshift.rdu2.redhat.com",
    "com.redhat.component": "rhceph-container",
    "com.redhat.license_terms": "https://www.redhat.com/en/about/red-hat-end-user-license-agreements",
    "description": "Red Hat Ceph Storage 4",
    "distribution-scope": "public",
    "io.k8s.description": "Red Hat Ceph Storage 4",
    "io.k8s.display-name": "Red Hat Ceph Storage 4 on RHEL 8",
    "io.openshift.expose-services": "",
    "io.openshift.tags": "rhceph ceph",
    "maintainer": "Dimitri Savineau <dsavinea>",
    "name": "rhceph",
    "release": "121.20200120.ci.1",
    "summary": "Provides the latest Red Hat Ceph Storage 4 on RHEL 8 in a fully featured and supported base image.",
    "url": "https://access.redhat.com/containers/#/registry.access.redhat.com/rhceph/images/4-121.20200120.ci.1",
    "vcs-ref": "76bcc9029f35fc0bef5e4ab813a23fe95d3ad2e1",
    "vcs-type": "git",
    "vendor": "Red Hat, Inc.",
    "version": "4"
},

How reproducible:
Always

Steps to Reproduce:
1. Deploy RHOSP 16 beta (GA Candidate if you will), with manila and ceph via nfs-ganesha backend. The ceph image should be the RHCS4.0 (beta versions are available on access.redhat.com/containers).
2. After the deployment finishes, create a manila share and allow access

Actual results:
Access rule transitions to "error" state

Expected results:
Access rule transitions to "active" state

Additional info:
Check the Triage info in further comments
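
The rpcinfo probe from the description can be scripted for repeated checks. The following is a minimal sketch, assuming `rpcinfo` is on the PATH; the helper names `nfs_program_registered` and `probe` are hypothetical, and the classification relies only on the exit status and the "Program not registered" message quoted above.

```python
import subprocess

NFS_PROGRAM = "100003"  # RPC program number queried in the report


def nfs_program_registered(returncode: int, output: str) -> bool:
    """Classify the result of `rpcinfo -T tcp <host> 100003`.

    A non-zero exit status, or the "Program not registered" message
    quoted in the report, is treated as "not registered".
    """
    if returncode != 0:
        return False
    return "Program not registered" not in output


def probe(host: str, timeout: float = 10.0) -> bool:
    """Run the same rpcinfo query as in the report against `host`."""
    proc = subprocess.run(
        ["rpcinfo", "-T", "tcp", host, NFS_PROGRAM],
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    return nfs_program_registered(proc.returncode, proc.stdout + proc.stderr)
```

Calling `probe("172.17.5.126")` from a controller node would repeat the check shown in the description; the address is the one from the report and will differ per deployment.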
Triage:

0) Check manila share service logs on the controller hosting the manila-share pacemaker bundle:

2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server [req-260337f6-29ab-4300-8c53-7be6ef02be0a da6f2e04378d4274bf5a5ca166d0e99a 99bffc13f98a43ffa30f12c3a081853d - - -] Exception during message handling: manila.exception.GaneshaCommandFailure: Ganesha management command failed.
Command: dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:/etc/ganesha/export.d/share-5036e505-be7b-42c0-81f7-12844726781e.conf.pbnc57 string:EXPORT(Export_Id=1001)
Exit code: 1
Stdout: ''
Stderr: 'Error org.freedesktop.DBus.Error.ServiceUnknown: The name org.ganesha.nfsd was not provided by any .service files\n'
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/manager.py", line 233, in _execute
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return execute(*args, **kwargs)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/utils.py", line 59, in __call__
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return self.execute(*args, **exkwargs)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/utils.py", line 101, in execute
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return processutils.execute(*cmd, **kwargs)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 424, in execute
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     cmd=sanitized_cmd)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Command: dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:/etc/ganesha/export.d/share-5036e505-be7b-42c0-81f7-12844726781e.conf.pbnc57 string:EXPORT(Export_Id=1001)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Exit code: 1
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stdout: ''
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stderr: 'Error org.freedesktop.DBus.Error.ServiceUnknown: The name org.ganesha.nfsd was not provided by any .service files\n'
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/manager.py", line 474, in add_export
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     "string:EXPORT(Export_Id=%d)" % xid)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     message='dbus call %s.%s' % (service, method), **kwargs)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/manager.py", line 242, in _execute
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     cmd=e.cmd)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server manila.exception.GaneshaCommandFailure: Ganesha management command failed.
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Command: dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:/etc/ganesha/export.d/share-5036e505-be7b-42c0-81f7-12844726781e.conf.pbnc57 string:EXPORT(Export_Id=1001)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Exit code: 1
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stdout: ''
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stderr: 'Error org.freedesktop.DBus.Error.ServiceUnknown: The name org.ganesha.nfsd was not provided by any .service files\n'
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/manager.py", line 187, in wrapped
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return f(self, *args, **kwargs)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/utils.py", line 568, in wrapper
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     return func(self, *args, **kwargs)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/manager.py", line 3554, in update_access
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/access.py", line 283, in update_access_rules
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/access.py", line 322, in _update_access_rules
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/access.py", line 390, in _update_rules_through_share_driver
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/access.py", line 390, in _update_rules_through_share_driver
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/cephfs/driver.py", line 289, in update_access
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     share_server=share_server)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/__init__.py", line 308, in update_access
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     self.ganesha.add_export(share['name'], confdict)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/manila/share/drivers/ganesha/manager.py", line 491, in add_export
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server     cmd=e.cmd)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server manila.exception.GaneshaCommandFailure: Ganesha management command failed.
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Command: dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:/etc/ganesha/export.d/share-5036e505-be7b-42c0-81f7-12844726781e.conf.pbnc57 string:EXPORT(Export_Id=1001)
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Exit code: 1
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stdout: ''
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server Stderr: 'Error org.freedesktop.DBus.Error.ServiceUnknown: The name org.ganesha.nfsd was not provided by any .service files\n'
2020-01-31 19:01:16.739 42 ERROR oslo_messaging.rpc.server

1) Log into one of the controller nodes and check the Podman containers for ceph. "ceph-nfs-pacemaker" container is likely missing:

podman ps | grep ceph
b279eaad8f5d undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest 9 days ago Up 9 days ago ceph-mds-controller-0
289e591ba4ec undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest 9 days ago Up 9 days ago ceph-mgr-controller-0
6400ff546651 undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest 9 days ago Up 9 days ago ceph-mon-controller-0

You can watch "podman ps" to see "ceph-nfs-pacemaker" container being restarted.
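
The traceback in step 0 repeats one D-Bus error several times, so a summary is easier to read than the raw log. The sketch below is an illustration only: the function name `summarize_dbus_errors` is hypothetical, and it assumes the log lines look like the `Stderr:` lines above. `ServiceUnknown` is the standard D-Bus error returned when no process owns the requested bus name, which is consistent with the ganesha container not running.

```python
import re
from collections import Counter
from typing import Iterable

# Matches e.g. "Error org.freedesktop.DBus.Error.ServiceUnknown: <message>"
# and stops the message at the closing quote or the literal "\n" in the log.
DBUS_ERROR = re.compile(r"Error (org\.freedesktop\.DBus\.Error\.\w+): ([^'\\]*)")


def summarize_dbus_errors(lines: Iterable[str]) -> Counter:
    """Count (error name, message) pairs found on 'Stderr:' log lines."""
    counts: Counter = Counter()
    for line in lines:
        if "Stderr:" not in line:
            continue
        match = DBUS_ERROR.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts
```

The function takes any iterable of lines, so it can be fed from an open file object holding the manila-share log; the log location is deployment-specific and is not assumed here.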
2) journalctl -u ceph-nfs@pacemaker

Jan 31 19:34:06 controller-0 podman[272774]: exec: PID 57: spawning /usr/bin/ganesha.nfsd -F -L STDOUT
Jan 31 19:34:06 controller-0 podman[272774]: exec: Waiting 57 to quit
Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version /builddir/build/BUILD/nfs-ganesha-2.8.3/src, built at Jan 17 2020 20:56:03 on
Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] load_rados_config :CONFIG :CRIT :Unknown urls backend
Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :NFS STARTUP :CRIT :Error (token scan) while parsing (/etc/ganesha/ganesha.conf)
Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] config_errs_to_log :CONFIG :CRIT :Config File (/etc/ganesha/ganesha.conf:24): new url (rados://manila_data/ganesha-export-index) open error (Success), ignored
Jan 31 19:34:06 controller-0 podman[272774]: 31/01/2020 19:34:06 : epoch 5e34812e : controller-0 : ganesha.nfsd-57[main] main :NFS STARTUP :FATAL :Fatal errors. Server exiting...
Jan 31 19:34:06 controller-0 podman[272774]: teardown: managing teardown after SIGCHLD
Jan 31 19:34:06 controller-0 podman[272774]: teardown: Waiting PID 57 to terminate
Jan 31 19:34:06 controller-0 podman[272774]: teardown: Process 57 is terminated
Jan 31 19:34:06 controller-0 podman[272774]: teardown: Bye Bye, container will die with return code 0
Jan 31 19:34:06 controller-0 podman[272774]: 2020-01-31 19:34:06.397920413 +0000 UTC m=+5.484116389 container died 6654dd25c3c73da64ea3bcc9fdf52e9900563db4791bda98b74e7d7af22dff8b (image=undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest, name=ceph-nfs-pacemaker)
Jan 31 19:34:06 controller-0 podman[272774]: 2020-01-31 19:34:06.456853837 +0000 UTC m=+5.543049781 container remove 6654dd25c3c73da64ea3bcc9fdf52e9900563db4791bda98b74e7d7af22dff8b (image=undercloud-0.ctlplane.redhat.local:8787/ceph/rhceph-4.0-rhel8:latest, name=ceph-nfs-pacemaker)
Jan 31 19:34:06 controller-0 podman[273303]: Error: no container with name or ID ceph-nfs-pacemaker found: no such container
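
To pull only the fatal startup messages out of the journal in step 2, the ganesha lines can be filtered by severity. This is a rough sketch, assuming the field layout seen in the lines above (`ganesha.nfsd-<pid>[thread] function :COMPONENT :LEVEL :message`); the pattern and the function name `ganesha_failures` are illustrative, not part of nfs-ganesha.

```python
import re
from typing import Iterable, List, Tuple

# Field layout inferred from the journal lines quoted in this report.
GANESHA_LINE = re.compile(
    r"ganesha\.nfsd-\d+\[[^\]]+\] \S+ "
    r":(?P<component>[^:]+?) :(?P<level>[A-Z]+) :(?P<message>.*)$"
)


def ganesha_failures(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (component, level, message) for CRIT and FATAL ganesha lines."""
    failures = []
    for line in lines:
        match = GANESHA_LINE.search(line.rstrip("\n"))
        if match and match.group("level") in ("CRIT", "FATAL"):
            failures.append(
                (match.group("component"), match.group("level"), match.group("message"))
            )
    return failures
```

Applied to the journal excerpt above, this would keep the "Unknown urls backend" line, the two config parsing errors, and the final "Fatal errors" line, and drop the EVENT and teardown lines.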
Hi, I also experienced the same problem today, below are my env and logs

cat containers-prepare-parameter.yaml |grep ceph
ceph_image: rhceph-4-rhel8
ceph_namespace: registry.redhat.io/rhceph
ceph_tag: latest

podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
director.ctlplane.localdomain:8787/rhceph/rhceph-4-rhel8 4-20 011ee108bfc9 2 weeks ago 1.01 GB

podman exec -it afcd980d2914 ceph --version
ceph version 14.2.4-125.el8cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)

podman ps -a |grep ceph
273261c2a995 director.ctlplane.localdomain:8787/rhceph/rhceph-4-rhel8:4-20 3 hours ago Up 3 hours ago ceph-mds-servercon01
b0955cb2afc5 director.ctlplane.localdomain:8787/rhceph/rhceph-4-rhel8:4-20 3 hours ago Up 3 hours ago ceph-mgr-servercon01
afcd980d2914 director.ctlplane.localdomain:8787/rhceph/rhceph-4-rhel8:4-20 3 hours ago Up 3 hours ago ceph-mon-servercon01

journalctl -u ceph-nfs@pacemaker
-- Logs begin at Thu 2020-04-16 21:53:38 WIB, end at Fri 2020-04-17 08:06:23 WIB. --
Apr 17 05:53:06 servercon01 systemd[1]: Starting Cluster Controlled ceph-nfs@pacemaker...
Apr 17 05:53:06 servercon01 podman[211408]: Error: no container with name or ID ceph-nfs-pacemaker found: no such container
Apr 17 05:53:06 servercon01 systemd[1]: Started Cluster Controlled ceph-nfs@pacemaker.
Apr 17 05:53:06 servercon01 podman[211434]: 2020-04-17 05:53:06.768784491 +0700 WIB m=+0.070863468 container create 6db64e07aa7894e0be73207d71da9be764182acb27f60107984b17de8be40926 (image=serverrhdir01.ctlplane.localdomain:8787/rhceph/>
Apr 17 05:53:06 servercon01 podman[211434]: 2020-04-17 05:53:06.855343008 +0700 WIB m=+0.157422033 container init 6db64e07aa7894e0be73207d71da9be764182acb27f60107984b17de8be40926 (image=serverrhdir01.ctlplane.localdomain:8787/rhceph/rh>
Apr 17 05:53:06 servercon01 podman[211434]: 2020-04-17 05:53:06.868005326 +0700 WIB m=+0.170084307 container start 6db64e07aa7894e0be73207d71da9be764182acb27f60107984b17de8be40926 (image=serverrhdir01.ctlplane.localdomain:8787/rhceph/r>
Apr 17 05:53:06 servercon01 podman[211434]: 2020-04-17 05:53:06.868078381 +0700 WIB m=+0.170157359 container attach 6db64e07aa7894e0be73207d71da9be764182acb27f60107984b17de8be40926 (image=serverrhdir01.ctlplane.localdomain:8787/rhceph/>
Apr 17 05:53:06 servercon01 podman[211434]: 2020-04-17 05:53:06 /opt/ceph-container/bin/entrypoint.sh: static: does not generate config
Apr 17 05:53:07 servercon01 podman[211434]: HEALTH_OK
Apr 17 05:53:07 servercon01 podman[211434]: 2020-04-17 05:53:07 /opt/ceph-container/bin/entrypoint.sh: SUCCESS
Apr 17 05:53:07 servercon01 podman[211434]: exec: PID 110: spawning /usr/bin/ganesha.nfsd -F -L STDOUT
Apr 17 05:53:07 servercon01 podman[211434]: exec: Waiting 110 to quit
Apr 17 05:53:07 servercon01 podman[211434]: 17/04/2020 05:53:07 : epoch 5e98e1d3 : servercon01 : ganesha.nfsd-110[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version /builddir/build/BUILD/nfs-ganesha-2.8.3/src, built at Jan>
Apr 17 05:53:07 servercon01 podman[211434]: 17/04/2020 05:53:07 : epoch 5e98e1d3 : servercon01 : ganesha.nfsd-110[main] load_rados_config :CONFIG :CRIT :Unknown urls backend
Apr 17 05:53:07 servercon01 podman[211434]: 17/04/2020 05:53:07 : epoch 5e98e1d3 : servercon01 : ganesha.nfsd-110[main] main :NFS STARTUP :CRIT :Error (token scan) while parsing (/etc/ganesha/ganesha.conf)
Apr 17 05:53:07 servercon01 podman[211434]: 17/04/2020 05:53:07 : epoch 5e98e1d3 : servercon01 : ganesha.nfsd-110[main] config_errs_to_log :CONFIG :CRIT :Config File (/etc/ganesha/ganesha.conf:24): new url (rados://manila_data/ganesha->
Apr 17 05:53:07 servercon01 podman[211434]: 17/04/2020 05:53:07 : epoch 5e98e1d3 : servercon01 : ganesha.nfsd-110[main] main :NFS STARTUP :FATAL :Fatal errors. Server exiting...
Apr 17 05:53:07 servercon01 podman[211434]: teardown: managing teardown after SIGCHLD
Apr 17 05:53:07 servercon01 podman[211434]: teardown: Waiting PID 110 to terminate
Apr 17 05:53:07 servercon01 podman[211434]: teardown: Process 110 is terminated
Apr 17 05:53:07 servercon01 podman[211434]: teardown: Bye Bye, container will die with return code 0
Apr 17 05:53:07 servercon01 podman[211434]: 2020-04-17 05:53:07.832934563 +0700 WIB m=+1.135013603 container died 6db64e07aa7894e0be73207d71da9be764182acb27f60107984b17de8be40926 (image=serverrhdir01.ctlplane.localdomain:8787/rhceph/rh>
Apr 17 05:53:07 servercon01 podman[211434]: 2020-04-17 05:53:07.903792947 +0700 WIB m=+1.205871983 container remove 6db64e07aa7894e0be73207d71da9be764182acb27f60107984b17de8be40926 (image=serverrhdir01.ctlplane.localdomain:8787/rhceph/>
Apr 17 05:53:07 servercon01 podman[211748]: Error: no container with name or ID ceph-nfs-pacemaker found: no such container
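
Both reports confirm the problem by noticing that "ceph-nfs-pacemaker" is absent from the container list. A small sketch of that check follows, assuming `podman ps --format "{{.Names}}"` is available on the controller; the helper names `missing_containers` and `running_container_names` are hypothetical.

```python
import subprocess
from typing import Iterable, List


def missing_containers(names_output: str, expected: Iterable[str]) -> List[str]:
    """Return the expected container names absent from `podman ps` output.

    `names_output` is one container name per line, as printed by
    `podman ps --format "{{.Names}}"`.
    """
    running = {line.strip() for line in names_output.splitlines() if line.strip()}
    return [name for name in expected if name not in running]


def running_container_names() -> str:
    """List the names of currently running containers via podman."""
    proc = subprocess.run(
        ["podman", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if podman itself fails
    )
    return proc.stdout
```

Because the container is restarted repeatedly, a single snapshot can briefly show it as running; calling `missing_containers(running_container_names(), ["ceph-nfs-pacemaker"])` a few times in a row gives a more reliable picture.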
(In reply to aryulianto from comment #10)
> Hi, I also experienced the same problem today, below are my env and logs

Hi, thanks for taking the time to update the bug with additional details and findings.

The bug will be resolved with the next update for RHCS, 4.1, which will include the fixes for [1] and [2].

1. https://bugzilla.redhat.com/show_bug.cgi?id=1797075
2. https://bugzilla.redhat.com/show_bug.cgi?id=1822328
> the bug will be resolved with the next update for rhcs, 4.1 which will include the fixes for [1] and [2]

Hi Giulio Fidente,

Thank you for the prompt response. Do you know approximately when RHCS 4.1 will be released? We need this feature for our OSP 16 deployment.

Regards,
RHCS 4.1 is currently planned for May 4th; it will follow OSP 16.0.2, which is planned for April 23rd.