Bug 2053706
| Summary: | [RFE] ceph snapshots datestamps lack a timezone field | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | George Law <glaw> |
| Component: | CephFS | Assignee: | Milind Changire <mchangir> |
| Status: | CLOSED ERRATA | QA Contact: | Amarnath <amk> |
| Severity: | medium | Docs Contact: | Akash Raj <akraj> |
| Priority: | unspecified | | |
| Version: | 5.0 | CC: | akraj, amk, ceph-eng-bugs, gfarnum, hyelloji, kdreyer, linuxkidd, mchangir, ngangadh, pdhange, pdonnell, stockage-dnum, vereddy, vshankar |
| Target Milestone: | --- | Keywords: | FutureFeature |
| Target Release: | 5.2 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ceph-16.2.8-8.el8cp | Doc Type: | Known Issue |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-08-09 17:37:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2102272 | | |

Doc Text:

> **The `getpath` command causes automation failure**
>
> The assumption that the directory returned by the `getpath` command is the directory under which snapshots are created causes automation failures and confusion. As a workaround, pass the directory path one level higher to the `snap-schedule add` command: snapshots are created one level above the path returned by `getpath`.
Description
George Law
2022-02-11 19:31:06 UTC
(In reply to George Law from comment #0)

> Description of problem:
>
> mgr snap-schedule module - snapshots all seem to happen on GMT regardless
> of the host machine's TZ

What TZ is the ceph-mgr container configured with? From bz#2023169, it sounds like containers use the UTC TZ.

The snap_schedule manager plugin uses Python's time.strftime
(https://github.com/ceph/ceph/blob/master/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L272),
which uses whatever default TZ the environment is configured with.

> Customer case: 03135925 (and also data available on 03138550)
>
> Customer's cluster is on CET, so UTC+1
>
> [root@t-vulpes-mds1 ~]# date; date -u; ls -l /mnt/test_root/tz_test/.snap
> Tue Feb 8 10:59:09 CET 2022
> Tue Feb 8 09:59:09 UTC 2022
>
> His expectation is that the snapshots will be named according to the
> container host TZ.
>
> E.g. the 08:00 snapshot would be named scheduled-2022-02-10-08_00_00, but
> instead it ends up named scheduled-2022-02-10-07_00_00.
>
> If not considered a bug, please consider this an RFE to at least add the TZ
> to the snapshot name so that if a snapshot is needed, it can be correctly
> identified.

+1 - sounds reasonable.

> Troubleshooting:
>
> I had him enable the snap-schedule logging, e.g.
>
> # ceph config set mgr mgr/snap_schedule/log_level debug
> # ceph config set mgr mgr/snap_schedule/log_to_file true
>
> and the timestamps in the ceph-mgr logs in /var/log/ceph/<fs_id>/ on the
> active manager node show GMT.
>
> I also tried this in a lab with machines on EST - and although my snapshot
> failed, it appears my 17:00 snapshot would have been named
> /mnt/cephfs/.snap/scheduled-2022-02-10-22_00_00 instead of the expected
> scheduled-2022-02-10-17_00_00.
>
> [root@ceph-client .snap]# date
> Thu Feb 10 17:10:20 EST 2022
> [root@ceph-client .snap]# date -u
> Thu Feb 10 22:12:22 UTC 2022

What do date/date -u show on the ceph-mgr container in the test lab node?
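To make the strftime point above concrete, here is a minimal, self-contained Python sketch. It is not the plugin's actual code; the epoch value and zone names are assumptions chosen to mirror the examples in this report. It shows how time.strftime renders the same instant differently depending on the environment's TZ:

```python
import os
import time

# Format string the snap_schedule plugin uses for snapshot names
# (schedule_client.py); the sketch assumes it is rendered via localtime.
SNAPSHOT_TS_FORMAT = '%Y-%m-%d-%H_%M_%S'
ts = 1644530400  # assumed epoch for 2022-02-10 22:00:00 UTC

os.environ['TZ'] = 'UTC'  # what a ceph-mgr container defaults to
time.tzset()
print(time.strftime(SNAPSHOT_TS_FORMAT, time.localtime(ts)))
# 2022-02-10-22_00_00

os.environ['TZ'] = 'America/New_York'  # what the EST lab host uses
time.tzset()
print(time.strftime(SNAPSHOT_TS_FORMAT, time.localtime(ts)))
# 2022-02-10-17_00_00
```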
> 2022-02-10 22:00:01,571 [Thread-4] [DEBUG] [mgr_util] Connection to cephfs 'cephfs' complete
> 2022-02-10 22:00:01,574 [Thread-4] [DEBUG] [mgr_util] [put] connection: <mgr_util.CephfsConnectionPool.Connection object at 0x7f6ffea149e8> usage: 1
> 2022-02-10 22:00:01,574 [Thread-4] [ERROR] [snap_schedule.fs.schedule_client] create_scheduled_snapshot raised an exception:
> 2022-02-10 22:00:01,630 [Thread-4] [ERROR] [snap_schedule.fs.schedule_client] Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py", line 204, in create_scheduled_snapshot
>     fs_handle.mkdir(snap_name, 0o755)
>   File "cephfs.pyx", line 1023, in cephfs.LibCephFS.mkdir
> cephfs.ObjectNotFound: error in mkdir /mnt/cephfs/.snap/scheduled-2022-02-10-22_00_00: No such file or directory [Errno 2]
>
> Version-Release number of selected component (if applicable):
>
> ceph versions
> {
>     "mon": { "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 3 },
>     "mgr": { "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 2 },
>     "osd": { "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 9 },
>     "mds": { "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 3 },
>     "overall": { "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 17 }
> }
>
> How reproducible:
>
> Steps to Reproduce:
>
> ceph cluster set up on EST (or any TZ other than GMT)
>
> [root@ceph-node02 129d7636-892c-11ec-86d1-52540093b6a2]# date
> Fri Feb 11 13:55:49 EST 2022
>
> 1. enable snap-schedule on the ceph5 manager in the cephadm shell and enable debug logging
>
> [ceph: root@ceph-node02 /]# ceph mgr module enable snap_schedule
> [ceph: root@ceph-node02 /]# ceph config set mgr mgr/snap_schedule/log_level debug
> [ceph: root@ceph-node02 /]# ceph config set mgr mgr/snap_schedule/log_to_file true
>
> 2. mount cephfs on a client machine, enable snapshots and activate
>
> [root@ceph-client ~]# ceph fs snap-schedule add /mnt/cephfs 1h

Is the file system mounted on /mnt/cephfs? If that's true, and you want to schedule snapshots for root (/), you'd need to use:

    ceph fs snap-schedule add / 1h

/mnt/cephfs is a mount point which is on the client side. It's not a path in CephFS, and that's the reason you see:

    /mnt/cephfs/.snap/scheduled-2022-02-10-22_00_00: No such file or directory

> [root@ceph-client ~]# ceph fs snap-schedule activate /mnt/cephfs/
>
> 3. watch the snap-schedule log on the active mgr host, e.g.
> /var/log/ceph/129d7636-892c-11ec-86d1-52540093b6a2/ceph-mgr.ceph-node02.piglqc.snap_schedule.log
>
> Actual results:
>
> snapshots failed, but logs indicate that the 17:00 EST snapshot would have
> been named scheduled-2022-02-10-22_00_00
>
> [root@ceph-client .snap]# date
> Thu Feb 10 17:10:20 EST 2022
> [root@ceph-client .snap]# date -u
> Thu Feb 10 22:12:22 UTC 2022
>
> (same create_scheduled_snapshot traceback as quoted above)
>
> Expected results:
>
> 17:00 snapshot named scheduled-2022-02-10-17_00_00, or at least a
> reference in the name to the TZ it represents, e.g.
> scheduled-2022-02-10-22_00_00Z = 22:00 GMT/ZULU
>
> Additional info:
>
> I also found this BZ regarding the container timezone settings, which was
> closed as 'wontfix':
> https://bugzilla.redhat.com/show_bug.cgi?id=2023169

----------------------------------------------------------------------------------------------------------------------------------------------------------

George Law (comment #3):

From my small ceph lab setup,
all nodes are on EST, including the current active mgr host:
[root@ceph-client .snap]# ceph -s
  cluster:
    id:     129d7636-892c-11ec-86d1-52540093b6a2
    health: HEALTH_WARN
            nodeep-scrub flag(s) set

  services:
    mon: 3 daemons, quorum ceph-node01,ceph-node02,ceph-node03 (age 0.434246s)
    mgr: ceph-node02.piglqc(active, since 40h), standbys: ceph-node01.khvomr
    mds: 2/2 daemons up, 1 standby
    osd: 9 osds: 9 up (since 40h), 9 in (since 6d)
         flags nodeep-scrub

  data:
    volumes: 2/2 healthy
    pools:   6 pools, 193 pgs
    objects: 51 objects, 101 KiB
    usage:   125 MiB used, 90 GiB / 90 GiB avail
    pgs:     193 active+clean
[root@ceph-client .snap]# ssh ceph-node02 "date;date -u"
Wed Feb 16 09:06:48 EST 2022
Wed Feb 16 14:06:48 UTC 2022
[root@ceph-node02 ~]# cephadm shell -- date
Inferring fsid 129d7636-892c-11ec-86d1-52540093b6a2
Inferring config /var/lib/ceph/129d7636-892c-11ec-86d1-52540093b6a2/mon.ceph-node02/config
Using recent ceph image quay.io/ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76fff41a77fa32d0b903061
Wed Feb 16 14:11:09 UTC 2022
Yes, creating the snapshot on / instead of /mnt/cephfs worked this morning, and it shows exactly what I was trying to explain:
my 9:00 AM snapshot got created with a 14:00 timestamp.
[root@ceph-client .snap]# date
Wed Feb 16 09:05:18 EST 2022
[root@ceph-client ceph]# cd /mnt/cephfs/.snap
[root@ceph-client .snap]# ls -la
total 0
drwxr-xr-x 2656038672 ceph ceph 14753811159892811571 Feb 11 10:01 .
drwxr-xr-x. 2 ceph ceph 24 Feb 11 10:01 ..
drwxr-xr-x. 2 ceph ceph 24 Feb 11 10:01 scheduled-2022-02-16-14_00_01
The end customer's main complaint is that if he ever needs to go back to look for something in a snapshot,
e.g. data that changed on Feb 16th at 10:05 AM, he cannot look at the snapshot labeled scheduled-2022-02-16-11_00_01;
because the names are stamped in UTC, the change in his timezone (CET) actually sits in the snapshot labeled scheduled-2022-02-16-10_00_01.
Even more so for my test cluster (EST): to recover data from a 9:05 AM change I would need to look at the scheduled-2022-02-16-15_00_01 snapshot.
Possible solution:

src/pybind/mgr/snap_schedule/fs/schedule_client.py, line 28:

    SNAPSHOT_TS_FORMAT = '%Y-%m-%d-%H_%M_%S'

Add the %Z format sequence to append the TZ:

    SNAPSHOT_TS_FORMAT = '%Y-%m-%d-%H_%M_%S_%Z'

Or make this a configurable option?
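For illustration, a hedged sketch of what the suggested %Z suffix would produce. Again, this is not the plugin's code; the epoch value and zone names are assumptions:

```python
import os
import time

# Proposed format with %Z appended (see the suggestion above).
SNAPSHOT_TS_FORMAT = '%Y-%m-%d-%H_%M_%S_%Z'
ts = 1644530400  # assumed epoch for 2022-02-10 22:00:00 UTC

for tz in ('UTC', 'America/New_York'):
    os.environ['TZ'] = tz
    time.tzset()
    # The %Z sequence appends the zone abbreviation to the name.
    print('scheduled-' + time.strftime(SNAPSHOT_TS_FORMAT, time.localtime(ts)))
# scheduled-2022-02-10-22_00_00_UTC
# scheduled-2022-02-10-17_00_00_EST
```

With the suffix, the same snapshot is unambiguously identifiable regardless of which TZ the naming host happened to use.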
(In reply to George Law from comment #3)

> [root@ceph-client .snap]# ssh ceph-node02 "date;date -u"
> Wed Feb 16 09:06:48 EST 2022
> Wed Feb 16 14:06:48 UTC 2022
>
> [root@ceph-node02 ~]# cephadm shell -- date
> [...]
> Wed Feb 16 14:11:09 UTC 2022

So this env has UTC as the default TZ, and that's why you get snapshot names stamped with UTC time.

> Yes, creating the snapshot on / instead of /mnt/cephfs worked this morning,
> and it shows exactly what I was trying to explain:
> my 9:00 AM snapshot got created with a 14:00 timestamp.
>
> [...]
>
> Possible solution:
>
> src/pybind/mgr/snap_schedule/fs/schedule_client.py, line 28:
>
>     SNAPSHOT_TS_FORMAT = '%Y-%m-%d-%H_%M_%S'
>
> Add the %Z format sequence to append the TZ:
>
>     SNAPSHOT_TS_FORMAT = '%Y-%m-%d-%H_%M_%S_%Z'
>
> Or make this a configurable option?

Sounds reasonable to me. I'll create an upstream tracker for this.

----------------------------------------------------------------------------------------------------------------------------------------------------------

Snapshot schedules are now getting created with time zone details:
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph versions
{
    "mon": {
        "ceph version 16.2.8-31.el8cp (987e514460fd87ce0ea9f17fb81b8e2338a44215) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.8-31.el8cp (987e514460fd87ce0ea9f17fb81b8e2338a44215) pacific (stable)": 2
    },
    "osd": {
        "ceph version 16.2.8-31.el8cp (987e514460fd87ce0ea9f17fb81b8e2338a44215) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.8-31.el8cp (987e514460fd87ce0ea9f17fb81b8e2338a44215) pacific (stable)": 3
    },
    "overall": {
        "ceph version 16.2.8-31.el8cp (987e514460fd87ce0ea9f17fb81b8e2338a44215) pacific (stable)": 20
    }
}
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# date -u
Sun Jun 5 16:46:11 UTC 2022
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# date
Sun Jun 5 12:46:17 EDT 2022
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs snap-schedule add / 1h 2022-06-05T12:49:00
Schedule set for path /
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs snap-schedule activate /
Schedule activated for path /
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# date
Sun Jun 5 12:51:52 EDT 2022
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ls -lrta
total 2
drwxr-xr-x. 6 root root 1212416040 Jun 3 12:44 scheduled-2022-06-05-16_49_00_UTC
drwxr-xr-x. 6 root root 1212416040 Jun 3 12:44 ..
drwxr-xr-x. 2 root root 0 Jun 3 12:44 .
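Note that the schedule start was given as 2022-06-05T12:49:00 on an EDT host and the snapshot is named scheduled-2022-06-05-16_49_00_UTC, i.e. the same instant rendered in UTC with the new _UTC suffix. A minimal Python sketch of that conversion (the timezone names are assumptions; this mirrors, rather than reproduces, the plugin's logic):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

# Schedule start as typed on the EDT host (values from the transcript above).
start_local = datetime(2022, 6, 5, 12, 49, 0,
                       tzinfo=ZoneInfo('America/New_York'))

# The snapshot name shows the same instant in UTC, with the TZ suffix.
start_utc = start_local.astimezone(ZoneInfo('UTC'))
print('scheduled-' + start_utc.strftime('%Y-%m-%d-%H_%M_%S_%Z'))
# scheduled-2022-06-05-16_49_00_UTC
```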
----------------------------------------------------------------------------------------------------------------------------------------------------------
Query:

Snapshot schedule on a subvolume does not work if we set it to the `ceph fs subvolume getpath` output.
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs subvolumegroup create cephfs subvolgroup_1
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs subvolume create cephfs subvol_1 --group_name subvolgroup_1
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs subvolume getpath cephfs subvol_1 subvolgroup_1
/volumes/subvolgroup_1/subvol_1/160f4832-e6de-427d-8a40-82bd137f4f16
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# date
Sun Jun 5 12:53:17 EDT 2022
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs snap-schedule add /volumes/subvolgroup_1/subvol_1/160f4832-e6de-427d-8a40-82bd137f4f16 1h 2022-06-05T12:56:00
Schedule set for path /volumes/subvolgroup_1/subvol_1/160f4832-e6de-427d-8a40-82bd137f4f16
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs snap-schedule activate /volumes/subvolgroup_1/subvol_1/160f4832-e6de-427d-8a40-82bd137f4f16
Schedule activated for path /volumes/subvolgroup_1/subvol_1/160f4832-e6de-427d-8a40-82bd137f4f16
Log error:

2022-06-05 16:56:00,780 [Thread-10] [DEBUG] [mgr_util] [put] connection: <mgr_util.CephfsConnectionPool.Connection object at 0x7f8a1243b2e8> usage: 1
2022-06-05 16:56:00,780 [Thread-10] [ERROR] [snap_schedule.fs.schedule_client] create_scheduled_snapshot raised an exception:
2022-06-05 16:56:00,781 [Thread-10] [ERROR] [snap_schedule.fs.schedule_client] Traceback (most recent call last):
  File "/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py", line 285, in create_scheduled_snapshot
    fs_handle.mkdir(snap_name, 0o755)
  File "cephfs.pyx", line 1023, in cephfs.LibCephFS.mkdir
cephfs.PermissionError: error in mkdir /volumes/subvolgroup_1/subvol_1/160f4832-e6de-427d-8a40-82bd137f4f16/.snap/scheduled-2022-06-05-16_56_00_UTC: Operation not permitted [Errno 1]
When I tried setting it on the path below, it worked:
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# date
Sun Jun 5 12:57:47 EDT 2022
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs snap-schedule add /volumes/subvolgroup_1/subvol_1/ 1h 2022-06-05T13:00:00
Schedule set for path /volumes/subvolgroup_1/subvol_1/
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ceph fs snap-schedule activate /volumes/subvolgroup_1/subvol_1/
Schedule activated for path /volumes/subvolgroup_1/subvol_1/
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# pwd
/mnt/cephfs_fuselqi3urty0k/volumes/subvolgroup_1/subvol_1/160f4832-e6de-427d-8a40-82bd137f4f16/.snap
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# cd ..
[root@ceph-amk-bz-2-v3o9mx-node7 160f4832-e6de-427d-8a40-82bd137f4f16]# cd ..
[root@ceph-amk-bz-2-v3o9mx-node7 subvol_1]# ls -lrt
total 1
drwxr-xr-x. 2 root root 44 Jun 5 12:55 160f4832-e6de-427d-8a40-82bd137f4f16
[root@ceph-amk-bz-2-v3o9mx-node7 subvol_1]# ls -lrta
total 2
drwxr-xr-x. 3 root root 176 Jun 5 12:53 ..
drwxr-xr-x. 3 root root 176 Jun 5 12:53 .
-rw-r-----. 1 root root 132 Jun 5 12:53 .meta
drwxr-xr-x. 2 root root 44 Jun 5 12:55 160f4832-e6de-427d-8a40-82bd137f4f16
[root@ceph-amk-bz-2-v3o9mx-node7 subvol_1]# cd .snap
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# ls -lrt
total 1
drwxr-xr-x. 3 root root 176 Jun 5 12:53 scheduled-2022-06-05-17_00_00_UTC
[root@ceph-amk-bz-2-v3o9mx-node7 .snap]# cd scheduled-2022-06-05-17_00_00_UTC/
[root@ceph-amk-bz-2-v3o9mx-node7 scheduled-2022-06-05-17_00_00_UTC]# ls -lrt
total 1
drwxr-xr-x. 2 root root 44 Jun 5 12:55 160f4832-e6de-427d-8a40-82bd137f4f16
[root@ceph-amk-bz-2-v3o9mx-node7 scheduled-2022-06-05-17_00_00_UTC]# cd 160f4832-e6de-427d-8a40-82bd137f4f16/
[root@ceph-amk-bz-2-v3o9mx-node7 160f4832-e6de-427d-8a40-82bd137f4f16]# ls -lrt
total 1
-rw-r--r--. 1 root root 44 Jun 5 12:55 testing_snap_scedule.txt
[root@ceph-amk-bz-2-v3o9mx-node7 160f4832-e6de-427d-8a40-82bd137f4f16]#
Is this expected, and do we need to give the path only up to the subvolume name?
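For automation, the Known Issue text at the top of this bug suggests scheduling one level above the `getpath` output. A small illustrative sketch of deriving that path, using the value from the transcript above:

```python
from pathlib import PurePosixPath

# Output of `ceph fs subvolume getpath cephfs subvol_1 subvolgroup_1`
# (from the transcript above).
getpath_output = '/volumes/subvolgroup_1/subvol_1/160f4832-e6de-427d-8a40-82bd137f4f16'

# Workaround: pass the parent directory to `ceph fs snap-schedule add`;
# scheduled snapshots then appear one level above the getpath directory.
schedule_path = str(PurePosixPath(getpath_output).parent)
print(schedule_path)  # /volumes/subvolgroup_1/subvol_1
```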
Milind, PTAL at Amarnath's query.

Cheers,
Venky

ok

@mchangir Thanks for the info

Regards,
Amarnath

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage Security, Bug Fix, and Enhancement Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5997

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days.