Bug 2042320 - CephFS: Log Failure details if subvolume clone fails.
Summary: CephFS: Log Failure details if subvolume clone fails.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 5.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 5.2
Assignee: Kotresh HR
QA Contact: Amarnath
Docs Contact: Akash Raj
URL:
Whiteboard:
Duplicates: 2047570 (view as bug list)
Depends On:
Blocks: 2042318 2102272
 
Reported: 2022-01-19 09:03 UTC by Madhu Rajanna
Modified: 2024-03-20 06:40 UTC
CC List: 15 users

Fixed In Version: ceph-16.2.8-6.el8cp
Doc Type: Enhancement
Doc Text:
.Reason for clone failure shows up when using `clone status` command

Previously, whenever a clone failed, the only way to check the reason for failure was by looking into the logs.

With this release, the reason for clone failure is shown in the output of the `clone status` command:

.Example
----
[ceph: root@host01 /]# ceph fs clone status cephfs clone1
{
  "status": {
    "state": "failed",
    "source": {
      "volume": "cephfs",
      "subvolume": "subvol1",
      "snapshot": "snap1",
      "size": "104857600"
    },
    "failure": {
      "errno": "122",
      "error_msg": "Disk quota exceeded"
    }
  }
}
----

The reason for a clone failure is shown in two fields:

- `errno`: error number
- `error_msg`: failure error string
Clone Of: 2042318
Environment:
Last Closed: 2022-08-09 17:37:27 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 55190 0 None None None 2022-04-14 13:35:03 UTC
Red Hat Issue Tracker RHCEPH-2997 0 None None None 2022-01-19 09:11:41 UTC
Red Hat Product Errata RHSA-2022:5997 0 None None None 2022-08-09 17:38:03 UTC

Comment 16 Mudit Agarwal 2022-05-30 15:41:29 UTC
*** Bug 2047570 has been marked as a duplicate of this bug. ***

Comment 17 Amarnath 2022-05-31 10:20:59 UTC
Hi Kotresh,

I was trying to reproduce this bug.
What would be the way to fail a clone operation once it has been initiated?
I tried cloning a volume until I exhausted the storage, but that does not fail the clone operation; instead it brings down the Ceph cluster, and I see the error below.


[root@ceph-amk-bz-2-qa3ps0-node7 ~]# ceph fs clone status cephfs clone_status_142
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/__init__.py", line 96, in get_subvolume_object
    self.upgrade_to_v2_subvolume(subvolume)
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/__init__.py", line 57, in upgrade_to_v2_subvolume
    version = int(subvolume.metadata_mgr.get_global_option('version'))
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/metadata_manager.py", line 144, in get_global_option
    return self.get_option(MetadataManager.GLOBAL_SECTION, key)
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/metadata_manager.py", line 138, in get_option
    raise MetadataMgrException(-errno.ENOENT, "section '{0}' does not exist".format(section))
volumes.fs.exception.MetadataMgrException: -2 (section 'GLOBAL' does not exist)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1446, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/volumes/module.py", line 437, in handle_command
    return handler(inbuf, cmd)
  File "/usr/share/ceph/mgr/volumes/module.py", line 34, in wrap
    return f(self, inbuf, cmd)
  File "/usr/share/ceph/mgr/volumes/module.py", line 682, in _cmd_fs_clone_status
    vol_name=cmd['vol_name'], clone_name=cmd['clone_name'],  group_name=cmd.get('group_name', None))
  File "/usr/share/ceph/mgr/volumes/fs/volume.py", line 622, in clone_status
    with open_subvol(self.mgr, fs_handle, self.volspec, group, clonename, SubvolumeOpType.CLONE_STATUS) as subvolume:
  File "/lib64/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/share/ceph/mgr/volumes/fs/operations/subvolume.py", line 72, in open_subvol
    subvolume = loaded_subvolumes.get_subvolume_object(mgr, fs, vol_spec, group, subvolname)
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/__init__.py", line 101, in get_subvolume_object
    self.upgrade_legacy_subvolume(fs, subvolume)
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/__init__.py", line 78, in upgrade_legacy_subvolume
    assert subvolume.legacy_mode
AssertionError

Is there a way I can test this BZ?
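
(Editor's note: for reference, one possible way to drive a clone into the `failed` state, matching the `Disk quota exceeded` example in the Doc Text, might be to shrink the source subvolume's quota below its used size before snapshotting and cloning it. The sketch below is untested and hypothetical; the subvolume names and sizes are illustrative, and it assumes the standard `ceph fs subvolume` commands plus the assumption that the clone inherits the source quota. The cancel-based route that actually was verified is shown in Comment 20.)

    # Hypothetical reproduction sketch -- untested; adjust names/sizes to the cluster.

    # 1. Create a source subvolume and write more data into it than the quota
    #    we will later shrink it to (the subvolume must be mounted for the write).
    ceph fs subvolume create cephfs subvol1
    ceph fs subvolume getpath cephfs subvol1     # mount this path, then e.g.:
    # dd if=/dev/urandom of=/mnt/subvol1/file bs=1M count=200

    # 2. Shrink the quota below the data size (assumption: resize allows
    #    shrinking below the used size on this build).
    ceph fs subvolume resize cephfs subvol1 104857600

    # 3. Snapshot and clone; if the clone inherits the 100 MB quota, copying
    #    ~200 MB of data should abort with EDQUOT (errno 122).
    ceph fs subvolume snapshot create cephfs subvol1 snap1
    ceph fs subvolume snapshot clone cephfs subvol1 snap1 clone1

    # 4. The failure reason should then appear in the clone status output.
    ceph fs clone status cephfs clone1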

Comment 20 Amarnath 2022-06-02 07:01:03 UTC
We are able to see the failure message whenever a clone operation is canceled or errors out.

[root@ceph-amk-bz-1-wu0ar7-node7 ~]# ceph fs subvolume snapshot clone cephfs subvol_clone_status snap_1 clone_status_10 --group_name subvolgroup_clone_status_1
[root@ceph-amk-bz-1-wu0ar7-node7 ~]# ceph fs clone status cephfs clone_status_10
{
  "status": {
    "state": "in-progress",
    "source": {
      "volume": "cephfs",
      "subvolume": "subvol_clone_status",
      "snapshot": "snap_1",
      "group": "subvolgroup_clone_status_1"
    }
  }
}
[root@ceph-amk-bz-1-wu0ar7-node7 ~]# ceph fs clone cancel cephfs clone_status_10
[root@ceph-amk-bz-1-wu0ar7-node7 ~]# ceph fs clone status cephfs clone_status_10
{
  "status": {
    "state": "canceled",
    "source": {
      "volume": "cephfs",
      "subvolume": "subvol_clone_status",
      "snapshot": "snap_1",
      "group": "subvolgroup_clone_status_1"
    },
    "failure": {
      "errno": "4",
      "error_msg": "user interrupted clone operation"
    }
  }
}
[root@ceph-amk-bz-1-wu0ar7-node7 ~]# 

Regards,
Amarnath
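
(Editor's note: a caller that wants to consume the new failure fields programmatically could poll `ceph fs clone status` and pull them out of the JSON output. A minimal sketch, assuming the JSON shape shown above and that `jq` is available; the volume and clone names are the ones from Comment 20 and are illustrative.)

    #!/bin/bash
    # Minimal sketch: poll clone status and report the new failure fields.
    VOL=cephfs
    CLONE=clone_status_10

    while true; do
        out=$(ceph fs clone status "$VOL" "$CLONE")
        state=$(echo "$out" | jq -r '.status.state')
        case "$state" in
            complete)
                echo "clone finished"; break ;;
            failed|canceled)
                # The failure block carries the errno and error string added by this fix.
                echo "clone $state: errno=$(echo "$out" | jq -r '.status.failure.errno')," \
                     "error_msg=$(echo "$out" | jq -r '.status.failure.error_msg')"
                break ;;
            *)
                sleep 5 ;;
        esac
    done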

Comment 23 Kotresh HR 2022-07-14 12:26:26 UTC
nit:
----
Previously, whenever a clone failed, the only way to check the reason for failure was by looking into the logs. T
With this release, the reason for clone failure is shown in the output of the `clone status` command:
----

There is an extra 'T' hanging in there after the first sentence?

Looks good to me otherwise.

Comment 25 errata-xmlrpc 2022-08-09 17:37:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage Security, Bug Fix, and Enhancement Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5997

