> We do not have support of creating snapshots on NFS mounts (BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1974416), due to this the replication of NFS exports to the remote cluster is also not possible.
>
> I feel the support of replication should be enabled on NFS exports too.

It's possible to snapshot the NFS shares, just not via the NFS client. You could set up a snapshot schedule instead using the "snap_schedule" module.
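For reference, a minimal snap_schedule workflow looks roughly like the sketch below (the filesystem name "a" is just a placeholder from a vstart cluster; on some builds the module may need to be enabled first):

```
# rough sketch; adjust the fs name and path to your setup
ceph mgr module enable snap_schedule       # no-op if the module is already enabled
ceph fs snap-schedule add --path=/ 1h fs=a # hourly snapshots at the FS root
ceph fs snap-schedule status /             # check "active", "first", "last"
```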
Hi Hemanth, I don't see snaps created even with ceph-fuse and the kernel client.

# ceph nfs export create cephfs a vstart /cephfs
{
    "bind": "/cephfs",
    "fs": "a",
    "path": "/",
    "cluster": "vstart",
    "mode": "RW"
}
# ceph fs snap-schedule add --path=/ 1h fs=a
Schedule set for path /
# date
Thu Jul 8 10:48:43 AM IST 2021
# ./bin/ceph fs snap-schedule status /
{"fs": "a", "subvol": null, "path": "/", "rel_path": "/", "schedule": "1h", "retention": {}, "start": "2021-07-08T00:00:00", "created": "2021-07-08T05:18:38", "first": null, "last": null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": true}
# mount -t nfs -o port=2049 192.168.0.138:/cephfs /mnt
# ls /mnt
testfile  testfile2
# date
Thu Jul 8 11:00:13 AM IST 2021
# ls -a /mnt/.snap
.  ..
# ./bin/ceph fs snap-schedule status /
{"fs": "a", "subvol": null, "path": "/", "rel_path": "/", "schedule": "1h", "retention": {}, "start": "2021-07-08T00:00:00", "created": "2021-07-08T05:18:38", "first": null, "last": null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": true}
# umount /mnt
# ./bin/ceph-fuse -m 192.168.0.138:40705 /mnt
ceph-fuse[62657]: starting ceph client
ceph-fuse[62657]: starting fuse
[root@localhost build]# ls /mnt
testfile  testfile2
[root@localhost build]# ls /mnt/.snap
# mount -t ceph 192.168.0.138:40705:/ /mnt -o name=admin,secret=AQDpieZg6KUiIxAAFqgwdEknjmA1/p7XXWSPDw==
# ls /mnt
testfile  testfile2
# ls -a /mnt/.snap
.  ..
I can also see the snapshots now. The steps are similar to Hemanth's; I was able to see them after one hour. The following line in the doc [1] caused confusion:

`So when a snapshot schedule with repeat interval 1h is added at 13:50 with the default start time, the first snapshot will be taken at 14:00.`

Also, this throws an error:

[root@localhost build]# ./bin/ceph config set mgr allow_m_granularity true
Error EINVAL: unrecognized config option 'allow_m_granularity'

[1] https://docs.ceph.com/en/latest/cephfs/snap-schedule/#usage
(In reply to Varsha from comment #14)
> Also this throws error
> [root@localhost build]# ./bin/ceph config set mgr allow_m_granularity true
> Error EINVAL: unrecognized config option 'allow_m_granularity'

Should be:

> ceph config set mgr mgr/snap_schedule/allow_m_granularity true
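As a quick sanity check (a sketch, assuming the option is registered as a normal mgr module option), the value can be read back with `ceph config get`:

```
# sketch: set the snap_schedule module option and read it back
ceph config set mgr mgr/snap_schedule/allow_m_granularity true
ceph config get mgr mgr/snap_schedule/allow_m_granularity   # should print: true
```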
Not sure why we're keeping all of these comments private.

I think this is just a case where the metadata (in particular, the mtime and change attribute) on the .snap directory isn't changing. The client did a full readdir earlier, and since nothing appears to have changed in the directory inode metadata, it's satisfying readdir out of the cache. This probably requires a fix in libcephfs to ensure that the mtime and change attribute get bumped when a snap is created or deleted.
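One way to check that hypothesis from the client side (a rough sketch, assuming /mnt is the NFS mount from comment #13) is to watch whether the directory times move once a scheduled snapshot fires; the NFSv4 change attribute isn't directly visible from userspace, but it typically tracks ctime:

```
# sketch: compare .snap metadata before and after the snapshot interval elapses
stat /mnt/.snap          # note mtime/ctime
sleep 3600               # wait for the 1h snap-schedule to fire
stat /mnt/.snap          # unchanged times give the client no reason to
                         # invalidate its cached readdir of .snap
ls -a /mnt/.snap
```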
I thought we specifically use configs in nfs-ganesha to prevent using its dcache?
We do. This is the NFS client's caching. The cache is invalidated if the metadata changes, but otherwise it's trusted.

The bottom line is that we're adding dentries to and removing them from the .snap/ directory without ensuring that the mtime and change attribute are updated. If we fix the client to update the mtime and change attr when snaps are created and deleted, this should work as expected.

Note that I see the same lack of change to the metadata with ceph-fuse too; it's just that it never caches the .snap directory contents because .snap directories are "special".

This should probably be transitioned to a libcephfs bug, IMO.
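If that's what is going on, forcing the NFS client to throw away its cached dentries may make the new snapshots show up even without a libcephfs fix. A rough way to confirm on the client from comment #13 (sketch only):

```
# sketch: invalidate client-side caches and re-list the snapshot directory
echo 2 > /proc/sys/vm/drop_caches      # drop reclaimable dentries and inodes
ls -a /mnt/.snap

# or, more heavy-handed, remount the export
umount /mnt
mount -t nfs -o port=2049 192.168.0.138:/cephfs /mnt
ls -a /mnt/.snap
```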
Very well, thanks for explaining, Jeff.
Ramana, can you look into this?
(In reply to Patrick Donnelly from comment #20)
> I thought we specifically use configs in nfs-ganesha to prevent using its
> dcache?

We don't disable dirent and attribute caching by default in cephadm-deployed NFS-Ganesha. See
https://github.com/ceph/ceph/blob/v16.2.10/src/pybind/mgr/cephadm/templates/services/nfs/ganesha.conf.j2

In the ansible-managed ganesha.conf that we use for OpenStack Manila, we have the following settings to disable attribute and dentry caching in NFS-Ganesha:

```
EXPORT_DEFAULTS {
    Attr_Expiration_Time = 0;
}

CACHEINODE {
    Dir_Chunk = 0;
    NParts = 1;
    Cache_Size = 1;
}
```

For the Ceph FSAL, we recommend turning off attribute and dentry caching. See
https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/config_samples/ceph.conf#L58
and
https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/config_samples/ceph.conf#L97

I don't see the ganesha.conf used for this setup in the BZ. We could have checked whether QE turned off the caching using a custom configuration.

I also found this old BZ, https://bugzilla.redhat.com/show_bug.cgi?id=1429347, where listing CephFS snapshots didn't work unless dirent caching was disabled in NFS-Ganesha.
(In reply to Ram Raja from comment #32)
> (In reply to Patrick Donnelly from comment #20)
> > I thought we specifically use configs in nfs-ganesha to prevent using its
> > dcache?
>
> We don't disable dirent and attribute caching by default in cephadm-deployed
> NFS-Ganesha. See
> https://github.com/ceph/ceph/blob/v16.2.10/src/pybind/mgr/cephadm/templates/services/nfs/ganesha.conf.j2
> [...]

So attribute caching is disabled at the export level, rather than at the Ganesha instance/cluster level, by the mgr/nfs module:
https://github.com/ceph/ceph/pull/41574/commits/283a8c184e748a0af5a5deedc2d80b855d747924

Example of a CephFS NFS export stored in a RADOS object by the mgr/nfs module with the 'attr_expiration_time = 0' config setting:

$ ./bin/rados -p .nfs -N nfs-ganesha get export-1 - | grep 'attr_expiration_time'
attr_expiration_time = 0;

Dirent caching was re-enabled because the RGW FSAL relies on NFS-Ganesha's dirent caching for RGW directory listing. See
https://www.spinics.net/lists/dev-ceph/msg03232.html
https://github.com/ceph/ceph/pull/41574/commits/9d114e910e953c0adf4ba5d31ab1d17bdfa1f638

We've tested CephFS/NFS-Ganesha only with dirent caching *disabled*. I don't think we investigated the effects of enabling NFS-Ganesha dirent caching on CephFS. This BZ could be one of the side effects of enabling Ganesha dirent caching. See https://www.spinics.net/lists/dev-ceph/msg03233.html

It doesn't look like dirent caching can be disabled per export. This means that an NFS-Ganesha instance/cluster with dirent caching disabled can only be used for CephFS exports.
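If we do want dirent caching off for a CephFS-only cluster, one option (a sketch; 'mycluster' and the file name are placeholders, and I believe the block is spelled MDCACHE on current Ganesha releases, CACHEINODE on older ones) would be to push a user-level config with `ceph nfs cluster config set`, which I believe gets layered on top of the cephadm-generated ganesha.conf for every daemon in that cluster:

```
# sketch: disable dirent caching cluster-wide for a CephFS-only NFS cluster
cat > ganesha-user.conf <<'EOF'
MDCACHE {
    Dir_Chunk = 0;
}
EOF
ceph nfs cluster config set mycluster -i ganesha-user.conf
```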
(In reply to Ram Raja from comment #32)
> (In reply to Patrick Donnelly from comment #20)
> > I thought we specifically use configs in nfs-ganesha to prevent using its
> > dcache?
>
> We don't disable dirent and attribute caching by default in cephadm-deployed
> NFS-Ganesha. See
> https://github.com/ceph/ceph/blob/v16.2.10/src/pybind/mgr/cephadm/templates/services/nfs/ganesha.conf.j2
> [...]

Looks like Ganesha dirent caching was re-enabled in Pacific only in Sept 2021 [1] and became available in release 16.2.7 (released Dec 2021). Before that, as Jeff mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1975689#c21, Ganesha dirent caching had been disabled since 16.1.0 [2]. Most likely QE tested this BZ with Ganesha dirent caching disabled in July 2021.

[1] https://github.com/ceph/ceph/pull/43075/commits/db053d0056d231a7dd430c6cdb335f95aa5d50e7
[2] https://github.com/ceph/ceph/pull/34382/commits/25f4dedd3e75c81b19911fd33b171e613ab1c559