Description of problem:
We're running OSP 16.1.9. We have some manila shares which are stuck in 'error_deleting' state and cannot be removed unless their status is reset to 'available'. One thing in common with these shares is that the instances associated with all these shares (share_instance_id) do not exist in the environment. Even so, the 'access_rules_status' is still 'active'. After the status is reset to 'available', the 'access_rules_status' gets set to 'error', after which we are able to delete the share.

For example, share ID: 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c

$ manila show 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c
+---------------------------------------+---------------------------------------------------------------------------+
| Property                              | Value                                                                     |
+---------------------------------------+---------------------------------------------------------------------------+
| id                                    | 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c                                      |
| size                                  | 5                                                                         |
| availability_zone                     | nova                                                                      |
| created_at                            | 2022-12-06T12:31:40.000000                                                |
| status                                | error_deleting                                                            |
| name                                  | pvc-ce7985c3-c3d0-493b-aa70-37253922b057                                  |
| description                           | provisioned-by=manila.csi.openstack.org                                   |
| project_id                            | 7b4c0133359d44a4ba633f7db8357524                                          |
| snapshot_id                           | None                                                                      |
| share_network_id                      | None                                                                      |
| share_proto                           | NFS                                                                       |
| metadata                              | {'manila.csi.openstack.org/cluster': 'cicd-f58nc'}                        |
| share_type                            | f6b11e95-1072-44a6-8d89-b660993a505e                                      |
| is_public                             | False                                                                     |
| snapshot_support                      | True                                                                      |
| task_state                            | None                                                                      |
| share_type_name                       | ceph                                                                      |
| access_rules_status                   | active                                                                    |
| replication_type                      | None                                                                      |
| has_replicas                          | False                                                                     |
| user_id                               | 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9          |
| create_share_from_snapshot_support    | False                                                                     |
| revert_to_snapshot_support            | False                                                                     |
| share_group_id                        | None                                                                      |
| source_share_group_snapshot_member_id | None                                                                      |
| mount_snapshot_support                | False                                                                     |
| share_server_id                       | None                                                                      |
| host                                  | hostgroup@cephfs#cephfs                                                   |
| export_locations                      |                                                                           |
|                                       | id = a6b373a8-4c77-42e1-ad03-7dbbab48782a                                 |
|                                       | path = 172.16.33.4:/volumes/_nogroup/a257d35b-d254-4dff-9dac-657bcb1625fc |
|                                       | preferred = False                                                         |
|                                       | share_instance_id = a257d35b-d254-4dff-9dac-657bcb1625fc                  |
|                                       | is_admin_only = False                                                     |
+---------------------------------------+---------------------------------------------------------------------------+

$ manila reset-state --state available 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c

$ manila show 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c
+---------------------------------------+---------------------------------------------------------------------------+
| Property                              | Value                                                                     |
+---------------------------------------+---------------------------------------------------------------------------+
| id                                    | 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c                                      |
| size                                  | 5                                                                         |
| availability_zone                     | nova                                                                      |
| created_at                            | 2022-12-06T12:31:40.000000                                                |
| status                                | available                                                                 |
| name                                  | pvc-ce7985c3-c3d0-493b-aa70-37253922b057                                  |
| description                           | provisioned-by=manila.csi.openstack.org                                   |
| project_id                            | 7b4c0133359d44a4ba633f7db8357524                                          |
| snapshot_id                           | None                                                                      |
| share_network_id                      | None                                                                      |
| share_proto                           | NFS                                                                       |
| metadata                              | {'manila.csi.openstack.org/cluster': 'cicd-f58nc'}                        |
| share_type                            | f6b11e95-1072-44a6-8d89-b660993a505e                                      |
| is_public                             | False                                                                     |
| snapshot_support                      | True                                                                      |
| task_state                            | None                                                                      |
| share_type_name                       | ceph                                                                      |
| access_rules_status                   | error                                                                     |
| replication_type                      | None                                                                      |
| has_replicas                          | False                                                                     |
| user_id                               | 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9          |
| create_share_from_snapshot_support    | False                                                                     |
| revert_to_snapshot_support            | False                                                                     |
| share_group_id                        | None                                                                      |
| source_share_group_snapshot_member_id | None                                                                      |
| mount_snapshot_support                | False                                                                     |
| share_server_id                       | None                                                                      |
| host                                  | hostgroup@cephfs#cephfs                                                   |
| export_locations                      |                                                                           |
|                                       | id = a6b373a8-4c77-42e1-ad03-7dbbab48782a                                 |
|                                       | path = 172.16.33.4:/volumes/_nogroup/a257d35b-d254-4dff-9dac-657bcb1625fc |
|                                       | preferred = False                                                         |
|                                       | share_instance_id = a257d35b-d254-4dff-9dac-657bcb1625fc                  |
|                                       | is_admin_only = False                                                     |
+---------------------------------------+---------------------------------------------------------------------------+

Version-Release number of selected component (if applicable):
16.1.9

How reproducible:
unknown

Steps to Reproduce:
1. unknown
2.
3.

Actual results:
The manila share doesn't delete.

Expected results:
The manila share is set to 'available' if the instance is gone, and it deletes.

Additional info:
Some of the logs indicate that manila is not able to update the access rules of the shares we are trying to delete, where the share_instance_id for that share does not exist in the environment. For example:

Share ID: 5b5d2a38-2a46-4d81-8690-ba9cc94629a6
Request ID: req-d7e38b71-85db-444c-8689-f0d6298ffbce

~~~
manila-api.log.2.gz:2023-01-30 10:19:00.712 21 INFO manila.api.openstack.wsgi [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] DELETE https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13786/v2/7b4c0133359d44a4ba633f7db8357524/shares/5b5d2a38-2a46-4d81-8690-ba9cc94629a6
manila-api.log.2.gz:2023-01-30 10:19:00.713 21 INFO manila.api.v1.shares [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] Delete share with id: 5b5d2a38-2a46-4d81-8690-ba9cc94629a6
manila-share.log:2023-01-30 10:19:01.742 42 DEBUG manila.share.access [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] Updating access rules for share instance cf864d1b-8d5f-4eea-96a0-454ad90ab8b3 belonging to share 5b5d2a38-2a46-4d81-8690-ba9cc94629a6. update_access_rules /usr/lib/python3.6/site-packages/manila/share/access.py:281
manila-share.log:2023-01-30 10:19:02.120 42 DEBUG oslo_concurrency.processutils [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] Running cmd (subprocess): dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406 execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372
manila-share.log:2023-01-30 10:19:02.149 42 DEBUG oslo_concurrency.processutils [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] CMD "dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406" returned: 1 in 0.029s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409
manila-share.log:2023-01-30 10:19:02.149 42 DEBUG oslo_concurrency.processutils [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] 'dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406' failed. Not Retrying. execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:457
manila-share.log:2023-01-30 10:19:02.150 42 ERROR manila.share.drivers.ganesha.manager [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] Error while executing management command on Ganesha node <no name>: dbus call exportmgr.RemoveExport.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
~~~
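
For anyone needing to apply the workaround from the description to several shares, here is a minimal sketch of the reset-then-delete sequence. It assumes the share is removed with `manila delete` (the description only says the share can be deleted afterwards) and that the `manila` client is on the PATH with admin credentials loaded. The function names `cleanup_commands` and `cleanup_share` are made up for illustration.

```python
import subprocess
from typing import Callable, List


def cleanup_commands(share_id: str) -> List[List[str]]:
    """Build the two CLI calls of the workaround, in order."""
    return [
        # Same reset-state call as shown in the description.
        ["manila", "reset-state", "--state", "available", share_id],
        # Assumption: deletion is done with the standard "manila delete".
        ["manila", "delete", share_id],
    ]


def cleanup_share(share_id: str, run: Callable = subprocess.run) -> None:
    """Run the workaround for one share, stopping at the first failure.

    `run` is injectable so the sequence can be dry-run or tested
    without a real OpenStack environment.
    """
    for cmd in cleanup_commands(share_id):
        # check=True raises CalledProcessError on a non-zero exit code.
        run(cmd, check=True)
```

Calling `cleanup_share("67d12f8b-5bb8-4486-83ce-b3a1b4030a9c")` would issue both commands against the live environment, so it is worth printing the output of `cleanup_commands(...)` first to review what will run.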
Thanks for raising the issue! We have encountered cases in the past where Manila shares got stuck in 'error_deleting' state:

https://bugzilla.redhat.com/show_bug.cgi?id=2081319
https://bugzilla.redhat.com/show_bug.cgi?id=1862931

I guess it would be hard to pinpoint what caused the problem if it's not reproducible.
> One thing in common with these shares is that the instances associated with all these shares (share_instance_id) do not exist in the environment.

From what I can see with the limited log excerpt here and the output of the commands, the "share instance" (manila's internal representation of a share object) does exist. You can find the instances on the environment with admin privileges by using the CLI "manila share-instance-list".

The "error_deleting" status is probably occurring because of an NFS-Ganesha failure. The DBUS call to RemoveExport is failing:

manila-share.log:2023-01-30 10:19:02.149 42 DEBUG oslo_concurrency.processutils [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] CMD "dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406" returned: 1 in 0.029s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409

We could debug this by enabling a higher level of logging for Ganesha, but the code looks a bit suspect too. We're missing exception handling around the RemoveExport call; during deletion of a share, we should ignore errors and proceed through the deletion:

https://github.com/openstack/manila/blob/d3419ed14ba7780b478deec4028d1f0859c5e801/manila/share/drivers/ganesha/__init__.py#L311
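
To illustrate the kind of exception handling meant here, a minimal sketch follows. It is not manila's actual code: `ExportRemovalError`, `remove_export_for_delete` and the `remove_export` callable are hypothetical stand-ins for whatever the driver raises and calls when the RemoveExport DBUS call fails.

```python
import logging
from typing import Callable

LOG = logging.getLogger(__name__)


class ExportRemovalError(Exception):
    """Hypothetical error raised when the RemoveExport call fails."""


def remove_export_for_delete(
    remove_export: Callable[[str], None],
    share_instance_id: str,
    *,
    deleting: bool,
) -> bool:
    """Remove an export, tolerating failures while a share is being deleted.

    Returns True if the export was removed, False if the failure was
    swallowed because the share is going away anyway.
    """
    try:
        remove_export(share_instance_id)
    except ExportRemovalError as exc:
        if not deleting:
            # Outside of deletion a failed removal is a real error.
            raise
        # During deletion, log and carry on so the share does not end
        # up stuck in 'error_deleting'.
        LOG.warning(
            "Ignoring export removal failure for share instance %s: %s",
            share_instance_id,
            exc,
        )
        return False
    return True
```

The point of the `deleting` flag is that the error is only ignored on the delete path; a normal access-rule update would still surface the failure.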
Hello,

What do you need from us to help with this? You asked for better Ganesha logging, but we don't see Ganesha running anywhere.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.2.6 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:1519