| Summary: | many file related inconsistencies with gnfs | ||
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Nag Pavan Chilakam <nchilaka> |
| Component: | gluster-nfs | Assignee: | Niels de Vos <ndevos> |
| Status: | CLOSED NOTABUG | QA Contact: | surabhi <sbhaloth> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | rhgs-3.2 | CC: | amukherj, jthottan, nchilaka, ndevos, rhs-bugs, skoduri, storage-qa-internal |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-02-15 14:31:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Nag Pavan Chilakam
2016-11-17 13:13:57 UTC
Hit this bug as part of validation of bug 1328451 - observing "Too many levels of symbolic links" after adding bricks and then issuing a replace-brick.

To me this also sounds like caching done on the NFS-client, and not something that Gluster/NFS can (or is expected to) fix. When files are deleted on one NFS-client, it is common to see those files for a little longer on other NFS-clients. Mounting with "noac" or dropping the cached dentries and inodes might help in this case (echo 2 > /proc/sys/vm/drop_caches). Otherwise, doing the operations on the 2nd NFS-client with some delay may be sufficient too. It is unclear to me whether this problem was newly introduced with this particular code change, or whether it existed before (which is what I expect).

Note that "ls" also executes a stat() system call by default on RHEL. To prevent the stat() from being executed, run "/bin/ls" or escape the bash alias by running "\ls". The NFS-client can have dentries cached, causing no new READDIR to be sent to the server. In case the attributes have already expired, only the stat() would be done. Depending on the state of the caches and the changes on the Gluster volume, either ENOENT or ESTALE would get returned.

If none of the above hints help, we need a tcpdump captured on the Gluster/NFS server. The capture should include the mounting of the NFS-clients, the NFS traffic, and the GlusterFS traffic. It would also be helpful to have the rpcdebug output from the NFS-clients. This information makes it possible for us to track the operations done, and the results returned, on the (NFS) filehandles and (GlusterFS) GFIDs.

If we mount the volume over NFS with the "noac" option, I do not see any of the problems. Hence this looks more like a design limitation that we can live with; the only problem is that an application relying on this data can fail. However, we can move this to 3.2.0-beyond.
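For reference, a rough sketch of the client-side workarounds mentioned above (disabling attribute caching with "noac", or dropping cached dentries and inodes). The server name, volume name and mount point are hypothetical placeholders, not values taken from this report:

```
# Mount the volume over NFSv3 with attribute caching disabled, so the
# client does not serve stale dentry/attribute data from its cache.
# "gnfs-server", "testvol" and "/mnt/testvol" are example names only.
mount -t nfs -o vers=3,noac gnfs-server:/testvol /mnt/testvol

# Alternatively, on an already-mounted client, drop the cached dentries
# and inodes so the next lookup goes back to the server.
echo 2 > /proc/sys/vm/drop_caches
```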
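Similarly, to rule out the extra stat() calls coming from the shell alias for "ls" (the directory path is only an example):

```
# On RHEL, "ls" is typically an alias that stat()s every entry; either
# of the following bypasses the alias so only the READDIR data is used.
/bin/ls /mnt/testvol
\ls /mnt/testvol
```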
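And a hedged sketch of how the requested captures could be collected; the capture interface, output file, and brick port range are assumptions that may need adjusting for the actual setup:

```
# On the Gluster/NFS server: capture portmapper, NFS and brick traffic.
# Start the capture before the clients mount the volume.
tcpdump -i any -s 0 -w /var/tmp/gnfs-issue.pcap \
    port 111 or port 2049 or portrange 49152-49251

# On each NFS-client: enable NFS client debugging so the operations on
# the filehandles appear in the kernel log, reproduce the issue, then
# switch the debugging off again.
rpcdebug -m nfs -s all
# ... reproduce the issue ...
rpcdebug -m nfs -c all
```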