Bug 1713307
| Summary: | ganesha-nfs didn't failback when writing files on Mac nfs client if the power is shut down | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | guolei <guol-fnst> | ||||
| Component: | ganesha-nfs | Assignee: | Soumya Koduri <skoduri> | ||||
| Status: | CLOSED UPSTREAM | QA Contact: | |||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 4.1 | CC: | bugs, guol-fnst, jthottan, skoduri | ||||
| Target Milestone: | --- | Keywords: | Triaged | ||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2020-03-12 12:48:27 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Can you please collect packet traces from all the machines (Node-1, Node-2 and especially the client machine) while repeating this test for just that single file (i.e., FileA)?

After I modified the following parameters, it became OK:

server.tcp-user-timeout: 3
client.tcp-user-timeout: 5

Can you explain how it works?

May I close this bug?

This bug is moved to https://github.com/gluster/glusterfs/issues/955, and will be tracked there from now on. Visit GitHub issues URL for further details.
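
On the question of how the tcp-user-timeout settings in the comments above work: the option names suggest they map to the Linux TCP_USER_TIMEOUT socket option, which limits how long transmitted data may stay unacknowledged before the kernel gives up and closes the connection. That mapping is an assumption here, not something confirmed in this report, as is the idea that the Gluster values are in seconds. The sketch below only shows the socket option itself on a plain Linux TCP socket; the helper names are hypothetical and it is not Gluster or Ganesha code.

```python
import socket

# TCP_USER_TIMEOUT is Linux-specific. 18 is its value in the Linux headers,
# used as a fallback if this Python build does not export the constant.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def to_milliseconds(seconds):
    # The kernel expects the timeout in milliseconds.
    return int(seconds * 1000)


def make_socket_with_user_timeout(timeout_seconds):
    # Hypothetical helper: create a TCP socket whose unacknowledged data
    # may wait at most timeout_seconds before the connection is dropped.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT,
                    to_milliseconds(timeout_seconds))
    return sock


def get_user_timeout_ms(sock):
    # Read the value back from the kernel (milliseconds).
    return sock.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT)
```

With a small value such as 3 or 5 seconds, a connection to a peer that lost power is torn down quickly instead of waiting for the much longer default retransmission limits. If the assumption above holds, that could explain why the stalled copy no longer waited 20 minutes.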
Created attachment 1572444: ganesha log

Description of problem:
We ran failover/failback tests on 3 nodes (Node-1, Node-2, Node-3). The software stack is "glusterfs + ctdb (public address) + nfs-ganesha". The Gluster volume type is replica 3. We mounted the volume on Mac OS over NFS through CTDB's floating IP, which was on Node-1 at the time, and wrote file A to the mountpoint. While file A was being copied to the mountpoint, Node-1 was powered off. The copy was suspended, although we could still copy other files to the mountpoint normally. 20 minutes later everything recovered and the copy of file A resumed. The Windows NFS client behaves the same as the Mac client, but the CentOS NFS client works well and shows no suspension.

Version-Release number of selected component (if applicable):
gluster version: 4.1.8
nfs-ganesha version: 2.7.3
Mac client (10.14.0)

How reproducible:

Steps to Reproduce:
1. Create a gluster volume (replica 3) and export it with CTDB + ganesha-nfs.
2. Mount the volume on Mac OS or Windows via the CTDB floating IP. Copy a file to the mountpoint.
3. Power off the node that holds the floating IP.

Actual results:
The copy was suspended, although we could still copy other files to the mountpoint normally. 20 minutes later everything recovered and the copy of file A resumed. No matter how many times we tried, we had to wait 20 minutes.

Expected results:
File A can be transferred in 1 or 2 minutes.

Additional info:
Here is the ganesha log of Node-2 when the floating IP transferred to Node-2.