Bug 1044923
| Field | Value | Field | Value |
| --- | --- | --- | --- |
| Summary: | AFR: self-heal of a few files not happening when an AWS EC2 instance is back online after a restart | | |
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | spandura |
| Component: | glusterfs | Assignee: | Pranith Kumar K <pkarampu> |
| Status: | CLOSED ERRATA | QA Contact: | spandura |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 2.1 | CC: | grajaiya, pkarampu, psriniva, ravishankar, spandura, vagarwal, vbellur |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 2.1.2 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.4.0.57rhs | Doc Type: | Bug Fix |
| Doc Text: | Previously, when a client disconnected and reconnected in quick succession, there was a possibility of stale locks on the brick, which could lead to hangs or failures in self-heal. This issue is now fixed. | | |
| Story Points: | --- | | |
| Clone Of: | | | |
| : | 1049932 (view as bug list) | Environment: | |
| Last Closed: | 2014-02-25 08:09:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1049932, 1113894 | | |
Description
spandura, 2013-12-19 09:02:57 UTC
Not able to reproduce the bug using the steps detailed in the bug description. Tried on a 6-node cluster (2x3 setup) on AWS. Mounted the volume on all 6 nodes via fuse and ran 'create_dirs_files.pl' provided by Shwetha to create files/dirs. When one of the nodes was rebooted, `gluster volume heal info` showed a few files to be healed. After some time, running the command again showed zero files to be healed.

Also, manually edited the afr changelogs of one file from the backend to simulate the state of the file in the bug:

    (1) [root@ip-10-182-130-232 b1]# getfattr -d -m . -e hex file3
    # file: file3
    trusted.afr.testvol-client-0=0x000000020000000000000000
    trusted.afr.testvol-client-1=0x000000020000000000000000
    trusted.afr.testvol-client-2=0x000000020000000000000000
    trusted.gfid=0xc6c413b40893418da1b57b6891425f06

    (2) [root@ip-10-191-207-97 b1-rep1]# getfattr -d -m . -e hex file3
    # file: file3
    trusted.afr.testvol-client-0=0x0000085a0000000000000000
    trusted.afr.testvol-client-1=0x000000000000000000000000
    trusted.afr.testvol-client-2=0x000000000000000000000000
    trusted.gfid=0xc6c413b40893418da1b57b6891425f06

    (3) [root@ip-10-203-61-189 b1-rep2]# getfattr -d -m . -e hex file3
    # file: file3
    trusted.afr.testvol-client-0=0x0000090a0000000000000000
    trusted.afr.testvol-client-1=0x000000900000000000000000
    trusted.afr.testvol-client-2=0x000000000000000000000000
    trusted.gfid=0xc6c413b40893418da1b57b6891425f06

After this, from the mount point:

    [root@ip-10-182-130-232 fuse_mnt]# cat file3

This healed the file from source node (3) into sink nodes (1) and (2), according to the heal algorithm.

Hi Shwetha, can you try to reproduce the issue again? I am not able to hit it on my setup.

Thanks,
Ravi

Requesting a repro of the bug.

Here is what happened. The ping timer expired for one of the bricks:

    [2013-12-17 12:12:09.679091] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-vol_dis_rep-client-3: server 10.225.10.102:49152 has not responded in the last 42 seconds, disconnecting.

After some time it re-connected and re-opened "7" fds; all these 7 files will get affected in a bit, wait and watch :-).

    [2013-12-17 12:12:35.765071] I [client-handshake.c:1308:client_post_handshake] 0-vol_dis_rep-client-3: 7 fds open - Delaying child_up until they are re-opened
    [2013-12-17 12:12:35.821908] I [client-handshake.c:930:client_child_up_reopen_done] 0-vol_dis_rep-client-3: last fd open'd/lock-self-heal'd - notifying CHILD-UP

There was one more ping-timer expiry on the same brick:

    [2013-12-17 12:14:00.742751] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-vol_dis_rep-client-3: server 10.225.10.102:49152 has not responded in the last 42 seconds, disconnecting.

This time, when the re-open happens, it sees 'same lk-versions', so it does not re-open the fds:

    [2013-12-17 12:14:00.770163] I [client-handshake.c:1474:client_setvolume_cbk] 0-vol_dis_rep-client-3: Connected to 10.225.10.102:49152, attached to remote volume '/rhs/bricks/b2'.
    [2013-12-17 12:14:00.770183] I [client-handshake.c:1495:client_setvolume_cbk] 0-vol_dis_rep-client-3: Server and Client lk-version numbers are same, no need to reopen the fds

This means the client re-connected to the server before the previous resource table for the connection (open fds, acquired locks) was destroyed. Worse, it did not even re-open the files.
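A quick way to spot this disconnect/re-connect pattern is to grep the fuse client log for the messages quoted above. This is only a sketch: the log file path below is an assumption and depends on where the volume is mounted (fuse mount logs normally live under /var/log/glusterfs/ and are named after the mount point).

```sh
# Assumed client log path; adjust to your mount point's log file.
LOG=/var/log/glusterfs/mnt-fuse_mnt.log

# Ping-timer expiries (the disconnects described above)
grep 'rpc_client_ping_timer_expired' "$LOG"

# Re-connects that delayed CHILD-UP while re-opening fds
grep 'Delaying child_up until they are re-opened' "$LOG"

# Re-connects that skipped the fd re-open because the lk-versions matched,
# the case this analysis identifies as leaving the old resource table behind
grep 'Server and Client lk-version numbers are same' "$LOG"
```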
After this we see these logs:

    [2013-12-17 12:15:25.791261] E [afr-self-heal-data.c:1158:afr_sh_data_fstat_cbk] 0-vol_dis_rep-replicate-1: /user22/TestDir1/file4: fstat failed on vol_dis_rep-client-3, reason File descriptor in bad state
    [2013-12-17 12:15:25.863644] I [afr-lk-common.c:676:afr_unlock_inodelk_cbk] 0-vol_dis_rep-replicate-1: /user21/TestDir1/file4: unlock failed on subvolume vol_dis_rep-client-3 with lock owner 800e94044f7f0000

This means the re-connection happened while self-heal was in progress and, as we saw earlier, the resource table was not destroyed. So what happened was that self-heal started on those 7 files. After it opened the files and acquired the locks, it proceeded to get the xattrs/stat structures from the bricks; at that point the disconnect described above hit, because of which the self-heals failed with EBADFD. Since it could not proceed with self-heal, it tried to send unlocks for the locks it had acquired as part of self-heal, and those unlocks failed because the fds were in a bad state. So the stale locks on the bricks will be present forever. Because of this, self-heals from other mounts/self-heal daemons just can't acquire locks on those files, so they are not being self-healed. Will be sending the fix shortly.

Pranith, can you please verify the doc text for technical accuracy?

Looks good to me, approved.

Verified the fix on the build "glusterfs 3.4.0.57rhs built on Jan 13 2014 06:59:05" with the steps mentioned in the description. The bug is fixed. Moving the bug to VERIFIED state.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html
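For completeness, a minimal sketch of how the resulting state can be observed from the CLI on builds without the fix, and worked around as a last resort. The volume name "vol_dis_rep" and the file path are taken from the logs above; the clear-locks argument syntax varies slightly across releases, so treat this as an illustration rather than the procedure used while verifying this bug.

```sh
# Files stuck in the pending-heal list are the visible symptom
gluster volume heal vol_dis_rep info

# A brick statedump (written under /var/run/gluster by default) shows lock
# state; granted inodelks held by a connection that no longer exists are the
# stale locks described above
gluster volume statedump vol_dis_rep

# Last-resort workaround on releases without the fix: clear the stale granted
# locks on an affected file (exact argument syntax may differ per release;
# check the admin guide for your version before using this)
gluster volume clear-locks vol_dis_rep /user22/TestDir1/file4 kind granted inode
```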