Description of problem:
=======================
In a 6-node cluster (AWS EC2 instances) with a 2 x 3 distributed-replicate volume, one of the EC2 instances was restarted. Creation of files and directories was in progress when the instance was restarted. When the instance came back online, a few files were expected to be self-healed, but the self-heal did not happen on those files even after a day.

Following is the list of files that need self-heal:
===========================================================
root@ip-10-9-187-132 [Dec-19-2013- 8:47:52] >gluster v heal vol_dis_rep info
Gathering list of entries to be healed on volume vol_dis_rep has been successful

Brick ip-10-9-187-132:/rhs/bricks/b1
Status: Brick is Not connected
Number of entries: 0

Brick ip-10-10-42-177:/rhs/bricks/b1-rep1
Number of entries: 0

Brick ip-10-29-149-83:/rhs/bricks/b1-rep2
Number of entries: 0

Brick ip-10-225-10-102:/rhs/bricks/b2
Number of entries: 7
<gfid:bf68238c-e83d-42b3-990a-883e4bc89231>
<gfid:7e08033b-21b6-43c8-9e24-6f4acbd96a64>
<gfid:0d60596a-366b-4b4a-bdc8-6e8e271108a9>
<gfid:d59a4851-81d6-4717-8626-412029028b51>
<gfid:6052d512-c03a-4c06-9b5b-aa21a989bd74>
<gfid:b6e141fc-d37f-428c-98bc-2c12db28bf33>
<gfid:344588be-789b-42c1-9a49-2b3423204d26>

Brick ip-10-118-33-215:/rhs/bricks/b2-rep1
Number of entries: 7
/user28/TestDir1/file2
/user29/TestDir2/file1
/user22/TestDir1/file4
/user27/TestDir1/file3
/user21/TestDir1/file4
/user23/TestDir1/file6
/user25/TestDir1/file5

Brick ip-10-45-178-214:/rhs/bricks/b2-rep2
Number of entries: 7
/user28/TestDir1/file2
/user29/TestDir2/file1
/user22/TestDir1/file4
/user27/TestDir1/file3
/user21/TestDir1/file4
/user23/TestDir1/file6
/user25/TestDir1/file5

root@ip-10-9-187-132 [Dec-19-2013- 8:47:55] >

Volume Information
==================
root@ip-10-9-187-132 [Dec-19-2013- 8:48:24] >gluster v info

Volume Name: vol_dis_rep
Type: Distributed-Replicate
Volume ID: ba3ff0a8-d16e-40f9-995b-d0f9525d3e2a
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: ip-10-9-187-132:/rhs/bricks/b1
Brick2: ip-10-10-42-177:/rhs/bricks/b1-rep1
Brick3: ip-10-29-149-83:/rhs/bricks/b1-rep2
Brick4: ip-10-225-10-102:/rhs/bricks/b2
Brick5: ip-10-118-33-215:/rhs/bricks/b2-rep1
Brick6: ip-10-45-178-214:/rhs/bricks/b2-rep2
root@ip-10-9-187-132 [Dec-19-2013- 8:49:36] >
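For reference, the trusted.afr.<volname>-client-N values in the dumps below are AFR changelogs: three big-endian 32-bit counters recording pending data, metadata and entry operations against the named client/brick. A minimal decoding sketch in bash (the script name is hypothetical and not part of this report):

decode_afr_changelog.sh
=======================
#!/bin/bash
# Split a trusted.afr changelog value into its three 32-bit counters.
# Usage: ./decode_afr_changelog.sh 0x0000084d0000000000000000
val=${1#0x}                     # strip the leading 0x
data=$((16#${val:0:8}))         # hex chars 0-7  : pending data operations
metadata=$((16#${val:8:8}))     # hex chars 8-15 : pending metadata operations
entry=$((16#${val:16:8}))       # hex chars 16-23: pending entry operations
echo "data=$data metadata=$metadata entry=$entry"

For example, 0x0000084d0000000000000000 recorded on Brick5 against client-3 decodes to data=2125 metadata=0 entry=0, i.e. 2125 data operations pending against Brick4.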
Extended attributes of files to be self-healed on Brick4:
===========================================================
root@ip-10-225-10-102 [Dec-19-2013- 8:48:47] >cat files_to_heal | while read line ; do getfattr -d -e hex -m . /rhs/bricks/b2/$line ; done
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2//user28/TestDir1/file2
trusted.afr.vol_dis_rep-client-3=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000020000000000000000
trusted.gfid=0xbf68238ce83d42b3990a883e4bc89231

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2//user29/TestDir2/file1
trusted.afr.vol_dis_rep-client-3=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000020000000000000000
trusted.gfid=0x7e08033b21b643c89e246f4acbd96a64

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2//user22/TestDir1/file4
trusted.afr.vol_dis_rep-client-3=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000020000000000000000
trusted.gfid=0x0d60596a366b4b4abdc86e8e271108a9

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2//user27/TestDir1/file3
trusted.afr.vol_dis_rep-client-3=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000020000000000000000
trusted.gfid=0xd59a485181d647178626412029028b51

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2//user21/TestDir1/file4
trusted.afr.vol_dis_rep-client-3=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000020000000000000000
trusted.gfid=0x6052d512c03a4c069b5baa21a989bd74

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2//user23/TestDir1/file6
trusted.afr.vol_dis_rep-client-3=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000020000000000000000
trusted.gfid=0xb6e141fcd37f428c98bc2c12db28bf33

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2//user25/TestDir1/file5
trusted.afr.vol_dis_rep-client-3=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000020000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000020000000000000000
trusted.gfid=0x344588be789b42c19a492b3423204d26
root@ip-10-225-10-102 [Dec-19-2013- 8:49:18] >
Extended attributes of files to be self-healed on Brick5:
===========================================================
root@ip-10-118-33-215 [Dec-19-2013- 8:48:47] >cat files_to_heal | while read line ; do getfattr -d -e hex -m . /rhs/bricks/b2-rep1/$line ; done
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep1//user28/TestDir1/file2
trusted.afr.vol_dis_rep-client-3=0x0000084d0000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000000000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0xbf68238ce83d42b3990a883e4bc89231

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep1//user29/TestDir2/file1
trusted.afr.vol_dis_rep-client-3=0x0000084c0000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000000000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0x7e08033b21b643c89e246f4acbd96a64

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep1//user22/TestDir1/file4
trusted.afr.vol_dis_rep-client-3=0x000008490000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000000000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0x0d60596a366b4b4abdc86e8e271108a9

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep1//user27/TestDir1/file3
trusted.afr.vol_dis_rep-client-3=0x0000084f0000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000000000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0xd59a485181d647178626412029028b51

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep1//user21/TestDir1/file4
trusted.afr.vol_dis_rep-client-3=0x000008510000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000000000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0x6052d512c03a4c069b5baa21a989bd74

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep1//user23/TestDir1/file6
trusted.afr.vol_dis_rep-client-3=0x0000084e0000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000010000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000010000000000000000
trusted.gfid=0xb6e141fcd37f428c98bc2c12db28bf33

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep1//user25/TestDir1/file5
trusted.afr.vol_dis_rep-client-3=0x0000084a0000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000000000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0x344588be789b42c19a492b3423204d26
root@ip-10-118-33-215 [Dec-19-2013- 8:49:18] >
Extended attributes of files to be self-healed on Brick6:
===========================================================
root@ip-10-45-178-214 [Dec-19-2013- 8:48:47] >cat files_to_heal | while read line ; do getfattr -d -e hex -m . /rhs/bricks/b2-rep2/$line ; done
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep2//user28/TestDir1/file2
trusted.afr.vol_dis_rep-client-3=0x000008560000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000090000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0xbf68238ce83d42b3990a883e4bc89231

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep2//user29/TestDir2/file1
trusted.afr.vol_dis_rep-client-3=0x000008550000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000090000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0x7e08033b21b643c89e246f4acbd96a64

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep2//user22/TestDir1/file4
trusted.afr.vol_dis_rep-client-3=0x000008520000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000090000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0x0d60596a366b4b4abdc86e8e271108a9

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep2//user27/TestDir1/file3
trusted.afr.vol_dis_rep-client-3=0x000008580000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000090000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0xd59a485181d647178626412029028b51

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep2//user21/TestDir1/file4
trusted.afr.vol_dis_rep-client-3=0x0000085a0000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000090000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0x6052d512c03a4c069b5baa21a989bd74

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep2//user23/TestDir1/file6
trusted.afr.vol_dis_rep-client-3=0x000008570000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000090000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0xb6e141fcd37f428c98bc2c12db28bf33

getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2-rep2//user25/TestDir1/file5
trusted.afr.vol_dis_rep-client-3=0x000008530000000000000000
trusted.afr.vol_dis_rep-client-4=0x000000090000000000000000
trusted.afr.vol_dis_rep-client-5=0x000000000000000000000000
trusted.gfid=0x344588be789b42c19a492b3423204d26
root@ip-10-45-178-214 [Dec-19-2013- 8:49:18] >

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.4.0.50rhs built on Dec 16 2013 10:45:13

How reproducible:
=================

create_brick.sh
===============
#!/bin/bash
################################################################################
# Description :
#   Creates bricks for the RHS volume using the ephemeral devices
#   "/dev/xvdb" and "/dev/xvdc".
################################################################################
yes | mdadm --create /dev/md0 --level=0 -c256 --raid-devices=2 /dev/xvdb /dev/xvdc
echo 'DEVICE /dev/xvdb /dev/xvdc' > /etc/mdadm.conf
mdadm --detail --scan >> /etc/mdadm.conf ; echo $?
blockdev --setra 65536 /dev/md0

device="/dev/md0"
mkfs.xfs -f -i size=512 -n size=8192 $device
uuid=`xfs_admin -u $device | cut -f3 -d " "`
echo $uuid
grep -wq $uuid /etc/fstab > /dev/null 2>&1 && exit 1
mkdir -p /rhs/bricks
mount=/rhs/bricks/
echo $mount
echo "UUID=$uuid $mount xfs inode64,noatime,nodiratime 1 0" >> /etc/fstab
cat /etc/fstab | grep xfs
mount -a

Steps to Reproduce:
===================
1. Create 6 AWS EC2 instances of type m1.large with 2 ephemeral devices, and open ports 22, 24007 and 49152-49200.
2. On each of the EC2 instances, run "create_brick.sh" to create the bricks.
3. Create a 2 x 3 distribute-replicate volume (a command sketch is given after the volume status output below).
4. Create a FUSE mount from each of the EC2 instances.
5. From each mount point, create directories and files.

root@ip-10-9-187-132 [Dec-19-2013- 8:57:47] >df -h /mnt/gm1
Filesystem                    Size  Used Avail Use% Mounted on
ip-10-9-187-132:/vol_dis_rep  1.7T  681G  999G  41% /mnt/gm1

6. While I/O is in progress, restart NODE4 (the node containing brick4).

Actual results:
===============
There are a few files with pending self-heal. The self-heal daemon is not healing these files at all.

Expected results:
=================
The self-heal daemon should heal the files.

Additional info:
================

Peer Status:
============
root@ip-10-9-187-132 [Dec-19-2013- 8:50:34] >gluster peer status
Number of Peers: 5

Hostname: ip-10-10-42-177
Uuid: fb07c484-c6b3-4798-978e-33e9bc092db2
State: Peer in Cluster (Connected)

Hostname: ip-10-29-149-83
Uuid: 90845012-7719-4aae-b0ce-060864b3b75d
State: Peer in Cluster (Connected)

Hostname: ip-10-225-10-102
Uuid: 4ad40c07-7370-4a56-9567-e3a51d4a16f0
State: Peer in Cluster (Connected)

Hostname: ip-10-118-33-215
Uuid: 9bd9c5f0-85de-4ba1-9421-b1bbd42fe8c3
State: Peer in Cluster (Connected)

Hostname: ip-10-45-178-214
Uuid: e47e8de0-e1af-4e8c-ad23-309e2754790c
State: Peer in Cluster (Connected)
root@ip-10-9-187-132 [Dec-19-2013- 8:50:37] >

Volume status:
==============
root@ip-10-9-187-132 [Dec-19-2013- 9:00:33] >gluster v status
Status of volume: vol_dis_rep
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick ip-10-9-187-132:/rhs/bricks/b1            49152   Y       31398
Brick ip-10-10-42-177:/rhs/bricks/b1-rep1       49152   Y       7494
Brick ip-10-29-149-83:/rhs/bricks/b1-rep2       49152   Y       8821
Brick ip-10-225-10-102:/rhs/bricks/b2           49152   Y       1474
Brick ip-10-118-33-215:/rhs/bricks/b2-rep1      49152   Y       7147
Brick ip-10-45-178-214:/rhs/bricks/b2-rep2      49152   Y       7206
NFS Server on localhost                         2049    Y       31405
Self-heal Daemon on localhost                   N/A     Y       31412
NFS Server on ip-10-225-10-102                  2049    Y       1483
Self-heal Daemon on ip-10-225-10-102            N/A     Y       1490
NFS Server on ip-10-29-149-83                   2049    Y       8835
Self-heal Daemon on ip-10-29-149-83             N/A     Y       8842
NFS Server on ip-10-45-178-214                  2049    Y       7220
Self-heal Daemon on ip-10-45-178-214            N/A     Y       7228
NFS Server on ip-10-118-33-215                  2049    Y       7160
Self-heal Daemon on ip-10-118-33-215            N/A     Y       7168
NFS Server on ip-10-10-42-177                   2049    Y       7507
Self-heal Daemon on ip-10-10-42-177             N/A     Y       7515

Task Status of Volume vol_dis_rep
------------------------------------------------------------------------------
There are no active volume tasks

root@ip-10-9-187-132 [Dec-19-2013- 9:00:37] >
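Command sketch for steps 3-4 (volume name and brick paths taken from the volume info above, mount point from step 5; these are not necessarily the exact commands that were run):

gluster volume create vol_dis_rep replica 3 \
    ip-10-9-187-132:/rhs/bricks/b1 ip-10-10-42-177:/rhs/bricks/b1-rep1 \
    ip-10-29-149-83:/rhs/bricks/b1-rep2 ip-10-225-10-102:/rhs/bricks/b2 \
    ip-10-118-33-215:/rhs/bricks/b2-rep1 ip-10-45-178-214:/rhs/bricks/b2-rep2
gluster volume start vol_dis_rep

# On each instance, fuse-mount the volume and drive the file/directory creation
# from the mount point:
mkdir -p /mnt/gm1
mount -t glusterfs ip-10-9-187-132:/vol_dis_rep /mnt/gm1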
SOS Reports: http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1044923/
Not able to reproduce the bug using the steps detailed in the bug description. Tried on a 6-node cluster (2 x 3 setup) on AWS. Mounted the volume on all 6 nodes via FUSE and ran 'create_dirs_files.pl' provided by Shwetha to create files/dirs. When one of the nodes was rebooted, `gluster volume heal info` showed a few files to be healed. After some time, running the command again showed zero files to be healed.

Also, manually edited the afr changelogs of one file from the backend to simulate the state of the file in the bug:

(1)
[root@ip-10-182-130-232 b1]# getfattr -d -m . -e hex file3
# file: file3
trusted.afr.testvol-client-0=0x000000020000000000000000
trusted.afr.testvol-client-1=0x000000020000000000000000
trusted.afr.testvol-client-2=0x000000020000000000000000
trusted.gfid=0xc6c413b40893418da1b57b6891425f06

(2)
[root@ip-10-191-207-97 b1-rep1]# getfattr -d -m . -e hex file3
# file: file3
trusted.afr.testvol-client-0=0x0000085a0000000000000000
trusted.afr.testvol-client-1=0x000000000000000000000000
trusted.afr.testvol-client-2=0x000000000000000000000000
trusted.gfid=0xc6c413b40893418da1b57b6891425f06

(3)
[root@ip-10-203-61-189 b1-rep2]# getfattr -d -m . -e hex file3
# file: file3
trusted.afr.testvol-client-0=0x0000090a0000000000000000
trusted.afr.testvol-client-1=0x000000900000000000000000
trusted.afr.testvol-client-2=0x000000000000000000000000
trusted.gfid=0xc6c413b40893418da1b57b6891425f06

After this, from the mount point:

[root@ip-10-182-130-232 fuse_mnt]# cat file3

This healed the file from source node (3) into sink nodes (1) and (2), according to the heal algorithm.
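The exact backend commands are not listed above; a sketch of how changelogs like those in (3) could be set with setfattr from inside the brick directory (the brick and mount paths here are assumptions, not taken from this comment):

# On the brick that should become the heal source, e.g. b1-rep2 from (3):
setfattr -n trusted.afr.testvol-client-0 -v 0x0000090a0000000000000000 file3
setfattr -n trusted.afr.testvol-client-1 -v 0x000000900000000000000000 file3
setfattr -n trusted.afr.testvol-client-2 -v 0x000000000000000000000000 file3

# A subsequent read through the fuse mount (the cat above) then triggers the heal:
cat /path/to/fuse_mnt/file3 > /dev/null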
Hi Shwetha,

Can you try to reproduce the issue again? I am not able to hit it on my setup.

Thanks,
Ravi
Requesting a repro of the bug.
Here is what happened:

The ping timer expired for one of the bricks:

[2013-12-17 12:12:09.679091] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-vol_dis_rep-client-3: server 10.225.10.102:49152 has not responded in the last 42 seconds, disconnecting.

After some time it re-connected and re-opened 7 fds; all 7 of these files will get affected in a bit, wait and watch :-).

[2013-12-17 12:12:35.765071] I [client-handshake.c:1308:client_post_handshake] 0-vol_dis_rep-client-3: 7 fds open - Delaying child_up until they are re-opened
[2013-12-17 12:12:35.821908] I [client-handshake.c:930:client_child_up_reopen_done] 0-vol_dis_rep-client-3: last fd open'd/lock-self-heal'd - notifying CHILD-UP

There was one more ping-timer expiry on the same brick:

[2013-12-17 12:14:00.742751] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-vol_dis_rep-client-3: server 10.225.10.102:49152 has not responded in the last 42 seconds, disconnecting.

This time, when the re-connection happens, the client sees the same lk-version, so the fds are not re-opened:

[2013-12-17 12:14:00.770163] I [client-handshake.c:1474:client_setvolume_cbk] 0-vol_dis_rep-client-3: Connected to 10.225.10.102:49152, attached to remote volume '/rhs/bricks/b2'.
[2013-12-17 12:14:00.770183] I [client-handshake.c:1495:client_setvolume_cbk] 0-vol_dis_rep-client-3: Server and Client lk-version numbers are same, no need to reopen the fds

This means the client re-connected to the server before the previous resource table for the connection (open fds, acquired locks) was destroyed. Worse, it did not even re-open the files. After this we see these logs:

[2013-12-17 12:15:25.791261] E [afr-self-heal-data.c:1158:afr_sh_data_fstat_cbk] 0-vol_dis_rep-replicate-1: /user22/TestDir1/file4: fstat failed on vol_dis_rep-client-3, reason File descriptor in bad state
[2013-12-17 12:15:25.863644] I [afr-lk-common.c:676:afr_unlock_inodelk_cbk] 0-vol_dis_rep-replicate-1: /user21/TestDir1/file4: unlock failed on subvolume vol_dis_rep-client-3 with lock owner 800e94044f7f0000

This means the re-connection happened while self-heal was in progress and, as we saw earlier, the resource table was not destroyed. So what happened is: self-heal on those 7 files started; after opening the files and acquiring the locks, it proceeded to fetch the xattrs/stat structures from the bricks; at that point the disconnect above happened, so the self-heals failed with EBADFD. Since it could not proceed with the self-heal, it tried to release the locks it had acquired as part of self-heal, and that also failed because the fds are in a bad state. So the stale locks remain on the bricks forever. Because of this, self-heals from other mounts/self-heal daemons simply cannot acquire locks on these files, so they are never self-healed.

Will be sending the fix shortly.
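Until the fix is in, a possible way to confirm and manually clear the stale inode locks (not part of the analysis above; this assumes the statedump and clear-locks CLIs behave in this build as they do upstream):

# Dump brick state; granted inodelk entries show up in the statedump files
# (by default under /var/run/gluster/ on the brick nodes).
gluster volume statedump vol_dis_rep

# Clear stale granted inode locks on one of the affected files
# (the path is relative to the volume root).
gluster volume clear-locks vol_dis_rep /user22/TestDir1/file4 kind granted inode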
Pranith, Can you please verify the doc text for technical accuracy?
Looks good to me, approved.
Verified the fix on the build "glusterfs 3.4.0.57rhs built on Jan 13 2014 06:59:05" with the steps mentioned in the description. The bug is fixed. Moving the bug to VERIFIED state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-0208.html