Description of problem:

We're running OSP 16.1.9. We have some manila shares which are stuck in the 'error_deleting' state and cannot be removed unless their status is reset to 'available'. One thing these shares have in common is that the instances associated with them (share_instance_id) do not exist in the environment. Even so, the 'access_rules_status' is still 'active'. After the status is reset to 'available', the 'access_rules_status' changes to 'error', after which we are able to delete the share.

For example, share ID: 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c

$ manila show 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c
+---------------------------------------+---------------------------------------------------------------------------+
| Property                              | Value                                                                     |
+---------------------------------------+---------------------------------------------------------------------------+
| id                                    | 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c                                      |
| size                                  | 5                                                                         |
| availability_zone                     | nova                                                                      |
| created_at                            | 2022-12-06T12:31:40.000000                                                |
| status                                | error_deleting                                                            |
| name                                  | pvc-ce7985c3-c3d0-493b-aa70-37253922b057                                  |
| description                           | provisioned-by=manila.csi.openstack.org                                   |
| project_id                            | 7b4c0133359d44a4ba633f7db8357524                                          |
| snapshot_id                           | None                                                                      |
| share_network_id                      | None                                                                      |
| share_proto                           | NFS                                                                       |
| metadata                              | {'manila.csi.openstack.org/cluster': 'cicd-f58nc'}                        |
| share_type                            | f6b11e95-1072-44a6-8d89-b660993a505e                                      |
| is_public                             | False                                                                     |
| snapshot_support                      | True                                                                      |
| task_state                            | None                                                                      |
| share_type_name                       | ceph                                                                      |
| access_rules_status                   | active                                                                    |
| replication_type                      | None                                                                      |
| has_replicas                          | False                                                                     |
| user_id                               | 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9          |
| create_share_from_snapshot_support    | False                                                                     |
| revert_to_snapshot_support            | False                                                                     |
| share_group_id                        | None                                                                      |
| source_share_group_snapshot_member_id | None                                                                      |
| mount_snapshot_support                | False                                                                     |
| share_server_id                       | None                                                                      |
| host                                  | hostgroup@cephfs#cephfs                                                   |
| export_locations                      |                                                                           |
|                                       | id = a6b373a8-4c77-42e1-ad03-7dbbab48782a                                 |
|                                       | path = 172.16.33.4:/volumes/_nogroup/a257d35b-d254-4dff-9dac-657bcb1625fc |
|                                       | preferred = False                                                         |
|                                       | share_instance_id = a257d35b-d254-4dff-9dac-657bcb1625fc                  |
|                                       | is_admin_only = False                                                     |
+---------------------------------------+---------------------------------------------------------------------------+

$ manila reset-state --state available 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c
$ manila show 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c
+---------------------------------------+---------------------------------------------------------------------------+
| Property                              | Value                                                                     |
+---------------------------------------+---------------------------------------------------------------------------+
| id                                    | 67d12f8b-5bb8-4486-83ce-b3a1b4030a9c                                      |
| size                                  | 5                                                                         |
| availability_zone                     | nova                                                                      |
| created_at                            | 2022-12-06T12:31:40.000000                                                |
| status                                | available                                                                 |
| name                                  | pvc-ce7985c3-c3d0-493b-aa70-37253922b057                                  |
| description                           | provisioned-by=manila.csi.openstack.org                                   |
| project_id                            | 7b4c0133359d44a4ba633f7db8357524                                          |
| snapshot_id                           | None                                                                      |
| share_network_id                      | None                                                                      |
| share_proto                           | NFS                                                                       |
| metadata                              | {'manila.csi.openstack.org/cluster': 'cicd-f58nc'}                        |
| share_type                            | f6b11e95-1072-44a6-8d89-b660993a505e                                      |
| is_public                             | False                                                                     |
| snapshot_support                      | True                                                                      |
| task_state                            | None                                                                      |
| share_type_name                       | ceph                                                                      |
| access_rules_status                   | error                                                                     |
| replication_type                      | None                                                                      |
| has_replicas                          | False                                                                     |
| user_id                               | 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9          |
| create_share_from_snapshot_support    | False                                                                     |
| revert_to_snapshot_support            | False                                                                     |
| share_group_id                        | None                                                                      |
| source_share_group_snapshot_member_id | None                                                                      |
| mount_snapshot_support                | False                                                                     |
| share_server_id                       | None                                                                      |
| host                                  | hostgroup@cephfs#cephfs                                                   |
| export_locations                      |                                                                           |
|                                       | id = a6b373a8-4c77-42e1-ad03-7dbbab48782a                                 |
|                                       | path = 172.16.33.4:/volumes/_nogroup/a257d35b-d254-4dff-9dac-657bcb1625fc |
|                                       | preferred = False                                                         |
|                                       | share_instance_id = a257d35b-d254-4dff-9dac-657bcb1625fc                  |
|                                       | is_admin_only = False                                                     |
+---------------------------------------+---------------------------------------------------------------------------+

Version-Release number of selected component (if applicable):
16.1.9

How reproducible:
unknown

Steps to Reproduce:
1. unknown
2.
3.

Actual results:
The manila share doesn't delete.

Expected results:
The manila share is set to available if the instance is gone, and it deletes.

Additional info:
Some of the logs indicate that manila is unable to update the access rules of the shares we are trying to delete, in cases where the share_instance_id for that share does not exist in the environment. For example:

Share ID: 5b5d2a38-2a46-4d81-8690-ba9cc94629a6
Request ID: req-d7e38b71-85db-444c-8689-f0d6298ffbce

~~~
manila-api.log.2.gz:2023-01-30 10:19:00.712 21 INFO manila.api.openstack.wsgi [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] DELETE https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13786/v2/7b4c0133359d44a4ba633f7db8357524/shares/5b5d2a38-2a46-4d81-8690-ba9cc94629a6
manila-api.log.2.gz:2023-01-30 10:19:00.713 21 INFO manila.api.v1.shares [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] Delete share with id: 5b5d2a38-2a46-4d81-8690-ba9cc94629a6
manila-share.log:2023-01-30 10:19:01.742 42 DEBUG manila.share.access [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] Updating access rules for share instance cf864d1b-8d5f-4eea-96a0-454ad90ab8b3 belonging to share 5b5d2a38-2a46-4d81-8690-ba9cc94629a6. update_access_rules /usr/lib/python3.6/site-packages/manila/share/access.py:281
manila-share.log:2023-01-30 10:19:02.120 42 DEBUG oslo_concurrency.processutils [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] Running cmd (subprocess): dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406 execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372
manila-share.log:2023-01-30 10:19:02.149 42 DEBUG oslo_concurrency.processutils [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] CMD "dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406" returned: 1 in 0.029s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409
manila-share.log:2023-01-30 10:19:02.149 42 DEBUG oslo_concurrency.processutils [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] 'dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406' failed. Not Retrying. execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:457
manila-share.log:2023-01-30 10:19:02.150 42 ERROR manila.share.drivers.ganesha.manager [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] Error while executing management command on Ganesha node <no name>: dbus call exportmgr.RemoveExport.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
~~~
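To check whether other stuck shares hit the same failure, the manila-share logs can be scanned for commands that exited non-zero. The following is only a rough sketch: it assumes the `CMD "..." returned: N in Xs` line format seen in the excerpt above, and the `failed_commands` helper and its regex are illustrative names, not part of manila or oslo.concurrency.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   ... [req-<id> <user> <project> - - -] CMD "<command>" returned: <code> in <secs>s ...
CMD_RESULT = re.compile(
    r'\[(?P<req>req-[0-9a-f-]+)[^\]]*\]\s+CMD "(?P<cmd>[^"]+)" '
    r'returned: (?P<code>-?\d+) in (?P<secs>[\d.]+)s'
)


def failed_commands(lines):
    """Yield (request_id, exit_code, command) for each command that exited non-zero."""
    for line in lines:
        match = CMD_RESULT.search(line)
        if match and int(match.group("code")) != 0:
            yield match.group("req"), int(match.group("code")), match.group("cmd")


# Two lines copied from the log excerpt in this report.
sample = [
    'manila-share.log:2023-01-30 10:19:02.120 42 DEBUG oslo_concurrency.processutils '
    '[req-d7e38b71-85db-444c-8689-f0d6298ffbce '
    '0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 '
    '7b4c0133359d44a4ba633f7db8357524 - - -] Running cmd (subprocess): '
    'dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr '
    'org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406 execute '
    '/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372',
    'manila-share.log:2023-01-30 10:19:02.149 42 DEBUG oslo_concurrency.processutils '
    '[req-d7e38b71-85db-444c-8689-f0d6298ffbce '
    '0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 '
    '7b4c0133359d44a4ba633f7db8357524 - - -] CMD "dbus-send --print-reply --system '
    '--dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr '
    'org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406" returned: 1 in 0.029s execute '
    '/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409',
]

failures = list(failed_commands(sample))
for request_id, exit_code, command in failures:
    print(request_id, exit_code, command)
```

Only the second sample line matches, because the "Running cmd" line carries no exit code. Grouping the results by request ID would show which DELETE requests ended in a failed RemoveExport call.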
Thanks for raising the issue! We have encountered cases in the past where Manila shares got stuck in the 'error_deleting' state:

https://bugzilla.redhat.com/show_bug.cgi?id=2081319
https://bugzilla.redhat.com/show_bug.cgi?id=1862931

I guess it would be hard to pinpoint what caused the problem if it's not reproducible.
> One thing in common with these shares is that the instances associated with all these shares (share_instance_id) do not exist in the environment.

From what I can see in the limited log excerpt here and the output of the commands, the "share instance" (manila's internal representation of a share object) does exist. With admin privileges, you can list the instances in the environment with the CLI command "manila share-instance-list".

The "error_deleting" status is probably caused by an NFS-Ganesha failure. The DBUS call to RemoveExport is failing:

manila-share.log:2023-01-30 10:19:02.149 42 DEBUG oslo_concurrency.processutils [req-d7e38b71-85db-444c-8689-f0d6298ffbce 0a3daf14f6aecb825dbf414fb7eac3504d538f3ad0e53a2d0df84ce2e4ac3af9 7b4c0133359d44a4ba633f7db8357524 - - -] CMD "dbus-send --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:11406" returned: 1 in 0.029s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409

We could debug this by enabling a higher level of logging for Ganesha, but the code looks a bit suspect too. We're missing exception handling around the RemoveExport call. During deletion of a share, we should ignore errors from that call and proceed with the deletion:

https://github.com/openstack/manila/blob/d3419ed14ba7780b478deec4028d1f0859c5e801/manila/share/drivers/ganesha/__init__.py#L311
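To illustrate the kind of handling suggested above, here is a minimal, self-contained sketch of best-effort export removal during deletion. It is not the manila driver code: `ExportRemovalError`, `remove_export`, `delete_share`, and the `runner` callable are hypothetical stand-ins chosen for the example.

```python
import logging

LOG = logging.getLogger(__name__)


class ExportRemovalError(Exception):
    """Hypothetical stand-in for the error raised when RemoveExport fails."""


def remove_export(export_id, runner):
    """Hypothetical wrapper: call `runner` and raise if it reports a non-zero code."""
    return_code = runner(export_id)
    if return_code != 0:
        raise ExportRemovalError(
            f"RemoveExport failed for export {export_id}: rc={return_code}"
        )


def delete_share(export_id, runner, delete_backend_share):
    """Delete a share, treating export removal as best-effort cleanup."""
    try:
        remove_export(export_id, runner)
    except ExportRemovalError as exc:
        # The export may already be gone; log the failure and keep deleting.
        LOG.warning("Ignoring export removal failure during deletion: %s", exc)
    delete_backend_share()
    return "deleted"


# Simulate the failure from the log: the export removal command returns 1.
calls = []
result = delete_share(
    11406,
    runner=lambda _export_id: 1,
    delete_backend_share=lambda: calls.append("backend share removed"),
)
print(result, calls)
```

The trade-off is that a genuine NFS-Ganesha problem is only logged rather than surfaced as a share error, so the warning should carry enough detail to investigate afterwards.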
Hello,

What do you need from us to help with this? You asked for better Ganesha logging, but we don't see Ganesha running anywhere.