Bug 861015
Summary: Self-heal daemon referring to null gfid's
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterfs
Version: 2.0
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: spandura
Assignee: Pranith Kumar K <pkarampu>
QA Contact: spandura
CC: grajaiya, laurent.chouinard, rfortier, rhs-bugs, shaines, surs, vbellur
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Fixed In Version: glusterfs-3.4.0.4rhs-1
Last Closed: 2013-09-23 22:33:26 UTC
Attachments: glustershd log file (attachment 617990)
Brick1 log messages at the same time as the self-heal:
------------------------------------------------------
[2012-09-26 11:54:06.498683] I [server3_1-fops.c:823:server_getxattr_cbk] 0-dist-rep-rhevh-server: 80: GETXATTR (null) (glusterfs.gfid2path) ==> -1 (No such file or directory)
[2012-09-26 11:54:06.514790] I [server3_1-fops.c:823:server_getxattr_cbk] 0-dist-rep-rhevh-server: 81: GETXATTR (null) (glusterfs.gfid2path) ==> -1 (No such file or directory)
[2012-09-26 11:54:06.515144] I [server3_1-fops.c:823:server_getxattr_cbk] 0-dist-rep-rhevh-server: 82: GETXATTR (null) (glusterfs.gfid2path) ==> -1 (No such file or directory)
[2012-09-26 11:54:06.515458] I [server3_1-fops.c:823:server_getxattr_cbk] 0-dist-rep-rhevh-server: 83: GETXATTR (null) (glusterfs.gfid2path) ==> -1 (No such file or directory)
[2012-09-26 11:54:06.515766] I [server3_1-fops.c:823:server_getxattr_cbk] 0-dist-rep-rhevh-server: 84: GETXATTR (null) (glusterfs.gfid2path) ==> -1 (No such file or directory)
[2012-09-26 11:54:06.516079] I [server3_1-fops.c:823:server_getxattr_cbk] 0-dist-rep-rhevh-server: 85: GETXATTR (null) (glusterfs.gfid2path) ==> -1 (No such file or directory)
[2012-09-26 11:54:06.516873] I [server3_1-fops.c:823:server_getxattr_cbk] 0-dist-rep-rhevh-server: 88: GETXATTR (null) (glusterfs.gfid2path) ==> -1 (No such file or directory)
[2012-09-26 11:54:06.517170] I [server3_1-fops.c:823:server_getxattr_cbk] 0-dist-rep-rhevh-server: 89: GETXATTR (null) (glusterfs.gfid2path) ==> -1 (No such file or directory)

Steps to recreate inode-link failures:
1) Create a replicate volume.
2) Bring one of the bricks down.
3) Create some files: for i in {1..10}; do dd if=/dev/zero of=$i bs=1M count=10; done
4) Execute gluster volume heal info.

Steps to recreate getxattr failures:
[2012-10-16 16:02:41.822320] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-r2-client-1: remote operation failed: No such file or directory. Path: <gfid:4d221ab3-cacd-4971-91bf-d37d71eefb2c> (00000000-0000-0000-0000-000000000000). Key: glusterfs.gfid2path
1) Create a replicate volume.
2) Bring one of the bricks down.
3) Create some files: for i in {1..10}; do dd if=/dev/zero of=$i bs=1M count=10; done
4) Delete the files from the mount point.
5) Execute gluster volume heal info.

CHANGE: http://review.gluster.org/4097 (protocols: Suppress getxattr log when errno is ENOENT) merged in master by Vijay Bellur (vbellur)
CHANGE: http://review.gluster.org/4090 (cluster/afr: Link inode only on lookup) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4399 (Tests: Added function to get pending heal count from heal-info) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4400 (Tests: functions for shd statedump, child_up_status) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4401 (self-heald basic tests) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4402 (Test to check if inode-link failures appear) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4098 (self-heald: Remove stale index even in heal info) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4408 (Tests: Add utils to get index-path, index-count) merged in master by Anand Avati (avati)
CHANGE: http://review.gluster.org/4409 (Tests: Check that stale indices are removed on heal-info) merged in master by Anand Avati (avati)

Verified fix on the build by executing the steps as mentioned in Comment 4:
root@king [Jul-09-2013-17:00:10] >rpm -qa | grep glusterfs
glusterfs-fuse-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-server-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-debuginfo-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-devel-3.4.0.12rhs.beta3-1.el6rhs.x86_64
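The recreate recipes above can be scripted end to end. A minimal sketch follows; the volume name r2, the brick paths, and the mount point are assumptions (the report does not name them), and the gluster-specific steps are left as comments because they require a live replicate volume — only the file-creation loop runs as-is:

```shell
# Sketch of the reproduction recipe. Volume name "r2", brick paths, and
# the mount point are illustrative, not taken from the report.
# gluster volume create r2 replica 2 server1:/bricks/b0 server2:/bricks/b1
# gluster volume start r2
# mount -t glusterfs server1:/r2 /mnt/r2
# kill <pid-of-one-brick-process>          # bring one brick down
mnt=/tmp/r2-sketch                         # stand-in for the glusterfs mount
mkdir -p "$mnt" && cd "$mnt"
for i in $(seq 1 10); do                   # create ten 10 MB files
    dd if=/dev/zero of=$i bs=1M count=10 2>/dev/null
done
ls "$mnt" | wc -l                          # ten entries pending heal
# rm -f "$mnt"/*                           # getxattr variant: delete them
# gluster volume heal r2 info              # triggers the logged failures
```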
root@king [Jul-09-2013-17:00:17] >gluster --version
glusterfs 3.4.0.12rhs.beta3 built on Jul 6 2013 14:35:18

Bug is fixed.

Verified the bug on the build with the steps specified in comment 4:
====================================================================
root@king [Jul-23-2013-18:05:29] >rpm -qa | grep glusterfs-server
glusterfs-server-3.3.0.11rhs-1.el6rhs.x86_64
root@king [Jul-23-2013-18:05:39] >gluster --version
glusterfs 3.3.0.11rhs built on Jul 3 2013 05:17:12

Bug is fixed on the above build too.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2013-1262.html
Created attachment 617990 [details]
glustershd log file

Description of problem:
-----------------------
The self-heal daemon log contains "inode link failed on the inode (00000000-0000-0000-0000-000000000000)" messages while the self-heal daemon is healing virtual machine images.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
[root@rhs-client6 ~]# gluster --version
glusterfs 3.3.0rhsvirt1 built on Sep 25 2012 14:53:06

[root@rhs-client6 ~]# rpm -qa | grep gluster
glusterfs-3.3.0rhsvirt1-6.el6rhs.x86_64
glusterfs-rdma-3.3.0rhsvirt1-6.el6rhs.x86_64
vdsm-gluster-4.9.6-14.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-container-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-fuse-3.3.0rhsvirt1-6.el6rhs.x86_64
glusterfs-geo-replication-3.3.0rhsvirt1-6.el6rhs.x86_64
gluster-swift-proxy-1.4.8-4.el6.noarch
gluster-swift-account-1.4.8-4.el6.noarch
gluster-swift-doc-1.4.8-4.el6.noarch
glusterfs-server-3.3.0rhsvirt1-6.el6rhs.x86_64
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-object-1.4.8-4.el6.noarch

Steps to Reproduce:
-------------------
1. Create a distribute-replicate volume (2 x 2): 4 servers, one brick on each server.
   Brick details:
   brick1: rhs-client6.lab.eng.blr.redhat.com:/disk1
   brick2: rhs-client7.lab.eng.blr.redhat.com:/disk1
   brick3: rhs-client8.lab.eng.blr.redhat.com:/disk1
   brick4: rhs-client9.lab.eng.blr.redhat.com:/disk1
2. Power off hosts rhs-client7.lab.eng.blr.redhat.com and rhs-client9.lab.eng.blr.redhat.com.
3. Create new VMs from RHEVM.
4. Power on hosts rhs-client7.lab.eng.blr.redhat.com and rhs-client9.lab.eng.blr.redhat.com.
5. The self-heal daemon starts healing the newly created VMs onto bricks "brick2" and "brick4".

Actual results:
---------------
[2012-09-26 11:52:00.358377] I [client-handshake.c:1614:select_server_supported_programs] 0-replicate-rhevh-client-1: Using Program GlusterFS 3.3.0rhsvirt1, Num (1298437), Version (330)
[2012-09-26 11:52:00.358754] I [client-handshake.c:1411:client_setvolume_cbk] 0-replicate-rhevh-client-1: Connected to 10.70.36.31:24012, attached to remote volume '/disk2'.
[2012-09-26 11:52:00.358792] I [client-handshake.c:1423:client_setvolume_cbk] 0-replicate-rhevh-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2012-09-26 11:52:00.361151] I [client-handshake.c:453:client_set_lk_version_cbk] 0-replicate-rhevh-client-1: Server lk version = 1
[2012-09-26 11:52:00.363141] I [client-handshake.c:1614:select_server_supported_programs] 0-dist-rep-rhevh-client-3: Using Program GlusterFS 3.3.0rhsvirt1, Num (1298437), Version (330)
[2012-09-26 11:52:00.363486] I [client-handshake.c:1411:client_setvolume_cbk] 0-dist-rep-rhevh-client-3: Connected to 10.70.36.33:24011, attached to remote volume '/disk1'.
[2012-09-26 11:52:00.363510] I [client-handshake.c:1423:client_setvolume_cbk] 0-dist-rep-rhevh-client-3: Server and Client lk-version numbers are not same, reopening the fds
[2012-09-26 11:52:00.364092] I [client-handshake.c:453:client_set_lk_version_cbk] 0-dist-rep-rhevh-client-3: Server lk version = 1
[2012-09-26 11:54:06.497560] E [afr-self-heald.c:685:_link_inode_update_loc] 0-dist-rep-rhevh-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)
[2012-09-26 11:54:06.498099] E [afr-self-heald.c:685:_link_inode_update_loc] 0-dist-rep-rhevh-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)
[2012-09-26 11:54:06.498829] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-dist-rep-rhevh-client-0: remote operation failed: No such file or directory. Path: <gfid:0c27dbce-46e8-4ad3-8ba8-7ad94ebb47ae> (00000000-0000-0000-0000-000000000000). Key: glusterfs.gfid2path
[2012-09-26 11:54:06.514905] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-dist-rep-rhevh-client-0: remote operation failed: No such file or directory. Path: <gfid:6b498872-7387-47b2-a4d7-02eeb8c23c99> (00000000-0000-0000-0000-000000000000). Key: glusterfs.gfid2path
[2012-09-26 11:54:06.515234] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-dist-rep-rhevh-client-0: remote operation failed: No such file or directory. Path: <gfid:17146af7-a6a0-4b59-8063-5572d76aa8e6> (00000000-0000-0000-0000-000000000000). Key: glusterfs.gfid2path
[2012-09-26 11:54:06.515537] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-dist-rep-rhevh-client-0: remote operation failed: No such file or directory. Path: <gfid:c68681c4-7f15-43c8-aa99-a860482ab5a7> (00000000-0000-0000-0000-000000000000). Key: glusterfs.gfid2path
[2012-09-26 11:54:06.515860] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-dist-rep-rhevh-client-0: remote operation failed: No such file or directory. Path: <gfid:5725b541-26a3-4e4f-aeca-0ed974c3209e> (00000000-0000-0000-0000-000000000000). Key: glusterfs.gfid2path
[2012-09-26 11:54:06.516152] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-dist-rep-rhevh-client-0: remote operation failed: No such file or directory. Path: <gfid:306d4115-c5d7-4ff2-a32c-fd2bae17016b> (00000000-0000-0000-0000-000000000000). Key: glusterfs.gfid2path
[2012-09-26 11:54:06.516426] E [afr-self-heald.c:685:_link_inode_update_loc] 0-dist-rep-rhevh-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)
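The symptom in the logs above is the all-zero gfid in the _link_inode_update_loc error lines. A minimal sketch of how to count those entries in a shd log with grep; the log path used here is made up for the example (on a real node the daemon writes to /var/log/glusterfs/glustershd.log), and the sample lines are copied from this report:

```shell
# Build a small sample log (two lines taken from this report) and count
# the null-gfid inode-link failures in it. The path is illustrative.
log=/tmp/glustershd-sample.log
cat > "$log" <<'EOF'
[2012-09-26 11:54:06.497560] E [afr-self-heald.c:685:_link_inode_update_loc] 0-dist-rep-rhevh-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)
[2012-09-26 11:52:00.361151] I [client-handshake.c:453:client_set_lk_version_cbk] 0-replicate-rhevh-client-1: Server lk version = 1
EOF
# Only the E line matches; informational handshake lines are ignored.
grep -c 'inode link failed on the inode (00000000-0000-0000-0000-000000000000)' "$log"
```

On an affected build this count grows with every self-heal crawl, which is what made the daemon's log so noisy before the fix suppressed the ENOENT-path messages.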