| Summary: | Mount point should not be inaccessible between reconnect to server | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Sachidananda Urs <sac> |
| Component: | distribute | Assignee: | Amar Tumballi <amarts> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | 3.0.5 | CC: | aavati, anush, exa.exa, gluster.bugs, gluster-bugs, vijay, vraman |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Sacchi,

Can you check the same setup with 3.1.0alpha release and see if it works without any changes? I guess we don't need any changes in code base to get this working as we got 'gfid' feature.

If it works, can you close the bug?

-Amar

> Can you check the same setup with 3.1.0alpha release and see if it works
> without any changes? I guess we don't need any changes in code base to get this
> working as we got 'gfid' feature.
>
> If it works, can you close the bug?

I will check and update the bug.
Bring in a feature to block client writes for 10-20 seconds (or for the given option 'reconnection-timeout') if there is no connection with the server. This helps in bringing fail-over features into GlusterFS without the application knowing.

PATCH: http://patches.gluster.com/patch/5026 in master (defaults.{c,h}: _resume functions added)

PATCH: http://patches.gluster.com/patch/5039 in master (features/quiesce: new translator)

The quiesce translator has been developed for the same reason; we will enable it after testing it further, post 3.1.0.

Is there some good temporary workaround or patch that would allow nodes to reconnect without receiving 'Stale NFS file handle' errors until this fix gets released?

thx -mk

I get a "cannot access /mnt/glusterfs: Stale NFS file handle" warning when adding a new node to a cluster.... will this patch help resolve that?

(In reply to comment #8)
> I get a cannot access /mnt/glusterfs: Stale NFS file handle warning when adding
> a new node to a cluster.... will this patch help resolve that?

What version are you trying with? Can you see if the latest git head works fine for you?

Avati

I'm using 3.1.1... will try and see if I can get the git release.

Rich

The git I've just downloaded is unusable for me...

    # gluster volume create biostar transport tcp 172.16.0.1:/mnt/storage
    Creating Volume biostar failed
    # gluster volume help
    unrecognized word: help (position 1)
    biostar glusterd
    # gluster volume create help
    Segmentation fault (core dumped)

Rich

Update: I only get the Stale NFS file handle warning when adding the 1st new node to a cluster. For example, in a distributed cluster I have node1, node2 and node3. When I create the volume with node1 and then add node2 I get the error, but adding node3 is OK. Likewise, if I have a replicated-distributed cluster with node1+node2 and then add node3+node4 I get the error, but when I then add node5+node6 there is no error.

Hope this helps,
Rich

Can you try the latest git head? Some fixes related to the errors you are facing have gone into the code. It is very possible your issue has already been addressed in the repository code.

I've just downloaded the latest git release and it's still unusable for me, as per #11. I'll keep trying though, and will open another bug report if it's still a problem for me.

PATCH: http://patches.gluster.com/patch/5648 in master (quiesce: bring in feature to re-transmit the frames)
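For anyone who wants to experiment with the quiesce translator before it is enabled by default, the sketch below shows one way it could be wired into a hand-edited client volume file and mounted. This is an illustration under assumptions only: the volfile path, the volume name "biostar", the top-level subvolume name "biostar-dht", and the option name/value are placeholders (the option name is taken from the comment above and may differ in released code); only the translator type features/quiesce comes from the patches referenced above.

```sh
# Illustrative sketch only: copy the generated client volfile, append the
# quiesce translator on top of the existing graph, and mount with it.
# Paths, volume names, and the option name below are assumptions.
cp /etc/glusterd/vols/biostar/biostar-fuse.vol /tmp/biostar-quiesce.vol

cat >> /tmp/biostar-quiesce.vol <<'EOF'
volume quiesce
    type features/quiesce              # translator introduced by patch 5039 above
    option reconnection-timeout 20     # option name as given in the comment above; may differ in the released code
    subvolumes biostar-dht             # replace with the current top-level volume in the copied file
end-volume
EOF

# Mount using the edited volfile instead of fetching one from glusterd.
glusterfs -f /tmp/biostar-quiesce.vol /mnt/glusterfs
```

The idea, per the feature description above, is that file operations issued while the server connection is down are held by quiesce and resumed (or re-transmitted, per patch 5648) once the client reconnects, instead of failing back to the application.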
Scenario: two servers running plain distribute, with ucarp set up to start GlusterFS on a backup machine upon failover.

    Machine A     Machine B
          \         /
            UCARP
              |
      CLIENT (/mnt/gluster)

When Machine A goes down, ucarp starts GlusterFS automatically on Machine B and the client reconnects to the volumes on Machine B. But when I try to access the mount point, I get a `Stale NFS file handle' error. I have to unmount and remount the client.

Note: Machine A and Machine B are connected to the same SAN backends.
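For context, here is a minimal sketch of the kind of ucarp arrangement described above. All addresses, interface names, passwords, and volfile paths are placeholder assumptions; only the general shape (a shared virtual IP whose up-script starts the GlusterFS server on whichever machine takes over) reflects the reported setup.

```sh
#!/bin/sh
# /etc/ucarp/vip-up.sh (assumed path): run by ucarp on the machine that
# takes over the virtual IP; it brings the VIP up and starts the GlusterFS
# server exporting the shared SAN-backed directory.
/sbin/ip addr add 192.168.1.100/24 dev eth0
/usr/sbin/glusterfsd -f /etc/glusterfs/glusterfsd.vol

# ucarp itself would be started on both machines roughly like this
# (run outside the script; --srcip differs per machine):
#   ucarp --interface=eth0 --srcip=192.168.1.11 --vhid=1 --pass=secret \
#         --addr=192.168.1.100 \
#         --upscript=/etc/ucarp/vip-up.sh --downscript=/etc/ucarp/vip-down.sh
```

The client mounts the virtual address (192.168.1.100 in this sketch), so when Machine A dies the same mount point reconnects to Machine B; the `Stale NFS file handle' error reported above is what appears once that reconnect completes.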