| Summary: | glusterfs mountpoint hangs on disconnecting second node | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Andreas Kimpfler <andreas> |
| Component: | replicate | Assignee: | Anand Avati <aavati> |
| Status: | CLOSED WORKSFORME | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 3.1.0 | CC: | amarts, avati, chrisw, gluster-bugs, sanjay.naikwadi, sgowda, tm, vijay, wiedi |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | --- |
| Regression: | --- | Mount Type: | fuse |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Andreas Kimpfler
2010-11-23 15:40:26 UTC
A friend of mine tried this with four machines:

```
Volume Name: vstore
Type: Replicate
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 192.168.123.123:/srv/export
Brick2: 192.168.123.124:/srv/export
Brick3: 192.168.123.125:/srv/export
Brick4: 192.168.123.126:/srv/export
```

Setting up such a scenario and disconnecting the Ethernet never runs into the same problem that I have. So we think this might be a problem with the quorum.

---

Hi,

GlusterFS uses IP addresses. If all the interfaces are down, then the mount point cannot resolve the bricks (even the local host, as it does not use the 127.0.0.1 address). Since all the bricks go down, the operations fail.

As for the hang, fixes in 3.1.1 have taken care of it: after a 42-second timeout, the operations terminate. Once the network is back up and the bricks are up, the mount point is active again; you do not have to restart all the servers/bricks. In the 4-brick/server case, since the client still had access to at least one of the bricks, the operations were successful in that instance.

With regards,
Shishir

---

Hi,

Sorry for the late answer, and thanks for your explanation. So, in my understanding, if one node of a 2-node setup loses interface connectivity to the other node, it takes 42 seconds until the mount point becomes active/responsive again?

I ask because I want to use GlusterFS to store the images of my KVM virtual machines, maildirs, and web hosting data on these two machines; further plans are for more machines holding the same content. At the moment, if GlusterFS goes down, this means a totally unresponsive mount point, virtual machines that do not respond or may crash, IMAP servers that cannot deliver mailbox content, and so on.

Regards,
Andreas

---

Hi Andreas,

We recently fixed a bug similar to this one (bug 763737). Can you try the same experiment with the latest git head (now available at https://github.com/gluster/glusterfs), or wait a few more days for a QA release with these fixes?

Regards,
Amar

---

This works fine for us at the moment with the 3.1.3qa2 release. Please see if it fixes the issue for you.
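
---

For reference, a volume with the layout shown in the description could be created along these lines. This is a minimal sketch assuming 3.1-era `gluster` CLI syntax and that the four bricks are full replicas of each other (replica 4); the addresses are the ones from the volume info above.

```
# From the first node, add the other three servers to the trusted pool.
gluster peer probe 192.168.123.124
gluster peer probe 192.168.123.125
gluster peer probe 192.168.123.126

# Create a 4-way replicated volume over TCP and start it.
gluster volume create vstore replica 4 transport tcp \
    192.168.123.123:/srv/export \
    192.168.123.124:/srv/export \
    192.168.123.125:/srv/export \
    192.168.123.126:/srv/export
gluster volume start vstore
```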
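On the quorum hypothesis: replicate in 3.1 has no client-side quorum, so nothing arbitrates which side of a split may keep writing. For comparison, later GlusterFS releases added an option for this; a hedged sketch (the `cluster.quorum-type` option is not available in 3.1.0):

```
# In later GlusterFS releases: only allow writes while a majority of the
# replica set is reachable (not available in 3.1.x).
gluster volume set vstore cluster.quorum-type auto
```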
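The 42 seconds Shishir mentions is the client's ping timeout, after which an unreachable brick is declared down and blocked operations are terminated. It is tunable per volume; a sketch using the standard `network.ping-timeout` option, whose default is 42 seconds:

```
# Give up on an unreachable brick after 10 seconds instead of the
# default 42. Very low values risk spurious disconnects under load.
gluster volume set vstore network.ping-timeout 10
```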
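To test the git head Amar points to, the usual autotools build of GlusterFS applies. A sketch, assuming the standard source layout; the required packages (autoconf, automake, libtool, flex, bison, FUSE development headers) vary by distribution:

```
git clone https://github.com/gluster/glusterfs.git
cd glusterfs
./autogen.sh
./configure
make
sudo make install
```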