Bug 1429347
| Summary: | cannot list contents of a snapshot without ganesha restart | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Ram Raja <rraja> |
| Component: | NFS-Ganesha | Assignee: | Jeff Layton <jlayton> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ceph-qe-bugs <ceph-qe-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.1 | CC: | dang, ffilz, jlayton, kkeithle, mbenjamin, pasik, pdonnell, rperiyas |
| Target Milestone: | rc | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1561457 | Environment: | |
| Last Closed: | 2020-06-24 11:08:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1561457 | | |
Description
Ram Raja
2017-03-06 07:36:25 UTC
(In reply to Ram Raja from comment #0)
> 4. List contents of the snapshot. Unable to see the snapshot contents.
> $ ls .snap/snap42

What actually happens here? Do you get an error back or does it just look empty?
Sorry! I wasn't clear. Without the ganesha server restart, the snapshot looks empty even though it isn't.
In Steps to Reproduce:
> 4. List contents of the snapshot. Unable to see the snapshot contents.
> $ ls .snap/snap42
It's just empty.
Listing a particular file/directory within a snapshot works.
$ ls .snap/snap42/file0
.snap/snap99/file
And now, if I list the contents of the snapshot again, I can see the file/directory that I listed explicitly earlier, but the other files/directories in the snapshot are still missing.
$ ls .snap/snap42
file0
Ok, another question too -- you need to restart ganesha _and_ unmount/remount the nfs mount? Just remounting isn't sufficient?

(In reply to Jeff Layton from comment #3)
> Ok, another question too -- you need to restart ganesha _and_
> unmount/remount the nfs mount? Just remounting isn't sufficient?

Just remounting isn't sufficient. I initially tried remounting with no success. I then moved on to restarting the ganesha server, after which the client received ESTALE while listing the snapshot folder. So I proceeded to unmount/remount the client and could immediately list the snapshot contents.

Ok, thanks. That helps narrow it down to a real server-side problem, as it doesn't sound like it's due to missing cache invalidations on the NFS client or anything like that.

Hmm, Ganesha mkdir assumes the directory will be empty on creation, so it marks the directory as "populated". Dir_Max = 1 doesn't actually disable caching, it just means a directory with more than 1 entry won't be cached.... Will need to think on this one...

Hmm, tried to re-create, but I get an EPERM when trying to create the directory in .snap. Am I missing something to actually enable snapshots? The .snap directory does exist, though it never shows up in ls.

(In reply to Frank Filz from comment #7)
> Hmm, tried to re-create, but I get an EPERM when trying to create the
> directory in .snap. Am I missing something to actually enable snapshots?
>
> The .snap directory does exist, though it never shows up in ls.

Yeah, CephFS's snapshot feature is experimental and is disabled by default. Can you try enabling it by,
# ceph mds set allow_new_snaps true --yes-i-really-mean-it

Just tested with dirent chunking, and the files show up immediately!

(In reply to Frank Filz from comment #9)
> Just tested with dirent chunking, and the files show up immediately!

Nice! How does dirent chunking solve the issue that you explained in Comment 6?
I see quite a few patches in review related to dirent chunking, but I'm unable to connect the dots. Thanks!

Ganesha currently has a flag for directories, MDCACHE_DIR_POPULATED, that indicates that the dirent cache has been loaded with all entries. This flag is set on directory creation (under the assumption that a new directory is empty, so the fix for the pre-dirent-chunking code is simply to stop setting this flag when a new directory is created). Chunking readdir never sets that flag (since it doesn't really track whether it has read the entire directory). Chunking readdir does CHECK that flag (which is actually a problem, because the flag never being set means the directory will always be invalidated - oops.. fix to patch coming very soon...), but chunking readdir will ALWAYS read the directory if there are no chunks. No entries in a newly created directory means no chunks, thus even though chunking readdir considers the dirent cache valid, an empty cache is effectively always considered invalid. That flag will disappear completely once dirent chunking fully replaces the old scheme.

The dirent chunking feature in the NFS-Ganesha stable v2.5 branch solves this issue, as mentioned in Comment 11.

It's recommended to turn off dirent chunking for FSAL_CEPH:
https://github.com/nfs-ganesha/nfs-ganesha/commit/720b1466a7c982604c24c439e22a4c4d461eed4c#diff-d750277ebcac38a79b101cc0d2ed3f00R60

If we do this, we again run into this issue of not being able to list CephFS snapshot contents. Ramakrishnan Periysamy hit this issue after setting,
```
CACHEINODE {
    Dir_Chunk = 0;
    Dir_Max = 1;
}
```
in the ganesha.conf. He later set,
```
CACHEINODE {
    Dir_Chunk = 1;
    Dir_Max = 1;
}
```
and he could see the snapshot contents.

Is there a way to list CephFS snapshot contents with dirent chunking turned off, or should we recommend setting Dir_Chunk=1 to work around the issue?
Ramakrishnan hit the issue with NFS-Ganesha v2.5.5 and libcephfs, Ceph v12.2.4.

So, this is not a problem with 2.6, as the old dirent caching has been removed, so disabling dir_chunk just disables caching entirely, fixing this problem. Unfortunately, the only way to disable the old dirent caching is to either enable dirent chunking, or (per directory) have a directory with more entries than Dir_Max. Dir_Max cannot be 0, so it will only trigger in directories with at least 2 dirents (not counting "." and "..").

The "proper" fix for this kind of issue is to have the FSAL send an upcall to invalidate the dirents when a new snapshot is created (as must be done whenever a dirent is created behind Ganesha's back). Maybe a workaround would be to not mark a newly created directory as populated?

Dan provided a workaround in comment#15 for setups using NFS-Ganesha 2.5.5. The issue should go away with NFS-Ganesha 2.6.