Bug 2094822
| Summary: | [CephFS] Clone operations are failing with Assertion Error | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Amarnath <amk> |
| Component: | CephFS | Assignee: | Kotresh HR <khiremat> |
| Status: | CLOSED ERRATA | QA Contact: | Amarnath <amk> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 5.2 | CC: | ceph-eng-bugs, hyelloji, khiremat, tserlin, vshankar |
| Target Milestone: | --- | | |
| Target Release: | 5.3z1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ceph-16.2.10-100.el8cp | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-02-28 10:05:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2088698, 2182962 | | |
Tested with up to 122 clones and the issue is not seen:
```
[root@ceph-amk-bootstrap-clx3kj-node7 ~]# ceph fs clone status cephfs clone_status_122
{
    "status": {
        "state": "complete"
    }
}
[root@ceph-amk-bootstrap-clx3kj-node7 ~]# ceph fs subvolume snapshot info cephfs subvol_clone_status snap_1 --group_name subvolgroup_clone_status_1
{
    "created_at": "2023-01-23 08:28:43.445930",
    "data_pool": "cephfs.cephfs.data",
    "has_pending_clones": "no"
}
```
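As a side note, the per-clone verification above can be scripted. Here is a minimal sketch, assuming the clone naming scheme `clone_status_1` .. `clone_status_122` used in this test; the helper itself is hypothetical and simply parses the same JSON output of `ceph fs clone status` shown above:

```python
#!/usr/bin/env python3
# Hypothetical verification helper, assuming the naming scheme
# clone_status_1 .. clone_status_122 used in this test. It shells out
# to the same `ceph fs clone status` command shown above, whose output
# is JSON of the form {"status": {"state": "..."}}.
import json
import subprocess

def clone_state(volume: str, clone: str) -> str:
    out = subprocess.check_output(["ceph", "fs", "clone", "status", volume, clone])
    return json.loads(out)["status"]["state"]

if __name__ == "__main__":
    for i in range(1, 123):  # clones 1..122, as verified above
        name = f"clone_status_{i}"
        state = clone_state("cephfs", name)
        print(f"{name}: {state}")
        if state != "complete":
            raise SystemExit(f"{name} has not completed (state: {state})")
```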
The cluster reached full capacity with all the clones:
```
[root@ceph-amk-bootstrap-clx3kj-node7 ~]# ceph -s
  cluster:
    id:     fa611392-9af1-11ed-be7c-fa163e696b4f
    health: HEALTH_ERR
            2 backfillfull osd(s)
            1 full osd(s)
            3 nearfull osd(s)
            Degraded data redundancy: 632/595707 objects degraded (0.106%), 81 pgs degraded
            Full OSDs blocking recovery: 81 pgs recovery_toofull
            4 pool(s) full

  services:
    mon: 3 daemons, quorum ceph-amk-bootstrap-clx3kj-node1-installer,ceph-amk-bootstrap-clx3kj-node2,ceph-amk-bootstrap-clx3kj-node3 (age 42m)
    mgr: ceph-amk-bootstrap-clx3kj-node1-installer.aqnxiy(active, since 34h), standbys: ceph-amk-bootstrap-clx3kj-node2.glqgyi
    mds: 2/2 daemons up, 1 standby
    osd: 12 osds: 12 up (since 42m), 12 in (since 42m)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 321 pgs
    objects: 198.57k objects, 49 GiB
    usage:   150 GiB used, 30 GiB / 180 GiB avail
    pgs:     632/595707 objects degraded (0.106%)
             240 active+clean
             81  active+recovery_toofull+degraded

  io:
    client: 85 B/s rd, 341 B/s wr, 0 op/s rd, 0 op/s wr

  progress:
    Global Recovery Event (23h)
      [====================........] (remaining: 7h)
```
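Since the bulk cloning drove the cluster to HEALTH_ERR, a test run like this could guard against filling the cluster by checking fullness between clones. A minimal sketch, assuming `ceph -s -f json` is available on the test node; the helper and the stop-early policy are illustrative, not part of the original report:

```python
#!/usr/bin/env python3
# Illustrative guard (not part of the original report): inspect
# `ceph -s -f json` before starting the next clone, so a bulk-clone
# test can stop early instead of driving the cluster to full.
import json
import subprocess

# Standard Ceph health checks raised when OSDs or pools run out of space.
FULLNESS_CHECKS = {"OSD_FULL", "OSD_BACKFILLFULL", "OSD_NEARFULL", "POOL_FULL"}

def cluster_fullness_warnings() -> set:
    status = json.loads(subprocess.check_output(["ceph", "-s", "-f", "json"]))
    return FULLNESS_CHECKS & set(status["health"].get("checks", {}))

if __name__ == "__main__":
    hits = cluster_fullness_warnings()
    if hits:
        raise SystemExit(f"refusing to clone, cluster fullness checks: {sorted(hits)}")
    print("cluster has free capacity, safe to continue cloning")
```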
Versions:
```
[root@ceph-amk-bootstrap-clx3kj-node7 ~]# ceph versions
{
    "mon": {
        "ceph version 16.2.10-103.el8cp (4a5dd59c2e6616f05cc94e6aab2bddf1339ca4f4) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.10-103.el8cp (4a5dd59c2e6616f05cc94e6aab2bddf1339ca4f4) pacific (stable)": 2
    },
    "osd": {
        "ceph version 16.2.10-103.el8cp (4a5dd59c2e6616f05cc94e6aab2bddf1339ca4f4) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.10-103.el8cp (4a5dd59c2e6616f05cc94e6aab2bddf1339ca4f4) pacific (stable)": 3
    },
    "overall": {
        "ceph version 16.2.10-103.el8cp (4a5dd59c2e6616f05cc94e6aab2bddf1339ca4f4) pacific (stable)": 20
    }
}
[root@ceph-amk-bootstrap-clx3kj-node7 ~]#
```
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Ceph Storage 5.3 Bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2023:0980
Description of problem:
Clone operations are failing with an AssertionError when a large number of clones is created; in this case, 130+ clones were created.

```
[root@ceph-amk-bz-2-qa3ps0-node7 _nogroup]# ceph fs clone status cephfs clone_status_142
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/__init__.py", line 96, in get_subvolume_object
    self.upgrade_to_v2_subvolume(subvolume)
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/__init__.py", line 57, in upgrade_to_v2_subvolume
    version = int(subvolume.metadata_mgr.get_global_option('version'))
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/metadata_manager.py", line 144, in get_global_option
    return self.get_option(MetadataManager.GLOBAL_SECTION, key)
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/metadata_manager.py", line 138, in get_option
    raise MetadataMgrException(-errno.ENOENT, "section '{0}' does not exist".format(section))
volumes.fs.exception.MetadataMgrException: -2 (section 'GLOBAL' does not exist)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1446, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/volumes/module.py", line 437, in handle_command
    return handler(inbuf, cmd)
  File "/usr/share/ceph/mgr/volumes/module.py", line 34, in wrap
    return f(self, inbuf, cmd)
  File "/usr/share/ceph/mgr/volumes/module.py", line 682, in _cmd_fs_clone_status
    vol_name=cmd['vol_name'], clone_name=cmd['clone_name'], group_name=cmd.get('group_name', None))
  File "/usr/share/ceph/mgr/volumes/fs/volume.py", line 622, in clone_status
    with open_subvol(self.mgr, fs_handle, self.volspec, group, clonename, SubvolumeOpType.CLONE_STATUS) as subvolume:
  File "/lib64/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/share/ceph/mgr/volumes/fs/operations/subvolume.py", line 72, in open_subvol
    subvolume = loaded_subvolumes.get_subvolume_object(mgr, fs, vol_spec, group, subvolname)
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/__init__.py", line 101, in get_subvolume_object
    self.upgrade_legacy_subvolume(fs, subvolume)
  File "/usr/share/ceph/mgr/volumes/fs/operations/versions/__init__.py", line 78, in upgrade_legacy_subvolume
    assert subvolume.legacy_mode
AssertionError
```

Version-Release number of selected component (if applicable):
```
[root@ceph-amk-bz-1-wu0ar7-node7 ~]# ceph versions
{
    "mon": {
        "ceph version 16.2.8-27.el8cp (b0bd3a6c6f24d3ac855dde96982871257bef866f) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.8-27.el8cp (b0bd3a6c6f24d3ac855dde96982871257bef866f) pacific (stable)": 2
    },
    "osd": {
        "ceph version 16.2.8-27.el8cp (b0bd3a6c6f24d3ac855dde96982871257bef866f) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.8-27.el8cp (b0bd3a6c6f24d3ac855dde96982871257bef866f) pacific (stable)": 3
    },
    "overall": {
        "ceph version 16.2.8-27.el8cp (b0bd3a6c6f24d3ac855dde96982871257bef866f) pacific (stable)": 20
    }
}
[root@ceph-amk-bz-1-wu0ar7-node7 ~]#
```

How reproducible:
1/1

Steps to Reproduce:
1. Create a subvolume group: `ceph fs subvolumegroup create cephfs subvolgroup_clone_status_1`
2. Create a subvolume: `ceph fs subvolume create cephfs subvol_clone_status --size 5368706371 --group_name subvolgroup_clone_status_1`
3. Kernel-mount the volume and fill it with data.
4. Create a snapshot: `ceph fs subvolume snapshot create cephfs subvol_clone_status snap_1 --group_name subvolgroup_clone_status_1`
5. Create 200 clones of the above subvolume: `ceph fs subvolume snapshot clone cephfs subvol_clone_status snap_1 clone_status_1 --group_name subvolgroup_clone_status_1`

Actual results:
`ceph fs clone status` fails with the AssertionError shown above.

Expected results:
Should fail gracefully with a clear error message instead of an AssertionError.

Additional info:
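To make the traceback easier to follow, here is a simplified, self-contained sketch of the control flow in mgr/volumes that leads to the AssertionError. The function and attribute names mirror the traceback, but the bodies are illustrative assumptions, not the actual Ceph source:

```python
#!/usr/bin/env python3
# Simplified, illustrative reconstruction of the code path in the
# traceback above. Names mirror the traceback; the bodies are
# assumptions for illustration, not the real mgr/volumes code.
import errno

class MetadataMgrException(Exception):
    def __init__(self, error, message):
        super().__init__(message)
        self.error = error

class Subvolume:
    def __init__(self, has_global_section, legacy_mode):
        self.has_global_section = has_global_section
        # legacy_mode is only expected to be True for old-style
        # subvolumes, not for clones created by mgr/volumes.
        self.legacy_mode = legacy_mode

def upgrade_to_v2_subvolume(subvolume):
    # Reading the 'GLOBAL' section of the subvolume's metadata fails
    # when the clone's metadata was never fully written (plausibly
    # because the cluster ran full during bulk cloning).
    if not subvolume.has_global_section:
        raise MetadataMgrException(-errno.ENOENT, "section 'GLOBAL' does not exist")

def upgrade_legacy_subvolume(subvolume):
    # The failing assumption: any subvolume without v2 metadata must
    # be a legacy subvolume, which is false for a broken clone.
    assert subvolume.legacy_mode

def get_subvolume_object(subvolume):
    try:
        upgrade_to_v2_subvolume(subvolume)
    except MetadataMgrException:
        upgrade_legacy_subvolume(subvolume)

# A clone whose metadata is missing its GLOBAL section hits the assert:
try:
    get_subvolume_object(Subvolume(has_global_section=False, legacy_mode=False))
except AssertionError:
    print("AssertionError: same failure mode as the traceback above")
```

This matches the expected result stated above: the command should fail gracefully for a subvolume with unreadable metadata rather than tripping an assert.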