Bug 1244197 - gluster-DHT: Split brain issue and skipped file issues are observed during rebalance while IO and lookup were happening
Summary: gluster-DHT: Split brain issue and skipped file issues are observed during re...
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Pranith Kumar K
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard: Rebalance
Depends On:
Blocks:
 
Reported: 2015-07-17 12:53 UTC by Triveni Rao
Modified: 2018-12-11 10:17 UTC (History)
9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-29 10:12:18 UTC
Embargoed:



Description Triveni Rao 2015-07-17 12:53:58 UTC
Problem Description:
====================

gluster-DHT: Split brain issue and skipped file issues are observed during rebalance while IO and lookup were happening.

Version:
========

[root@rhsqa14-vm1 glusterfs]# rpm -qa | grep gluster
gluster-nagios-common-0.2.0-1.el6rhs.noarch
gluster-nagios-addons-0.2.4-2.el6rhs.x86_64
glusterfs-3.7.1-10.el6rhs.x86_64
glusterfs-cli-3.7.1-10.el6rhs.x86_64
glusterfs-libs-3.7.1-10.el6rhs.x86_64
glusterfs-client-xlators-3.7.1-10.el6rhs.x86_64
glusterfs-fuse-3.7.1-10.el6rhs.x86_64
glusterfs-server-3.7.1-10.el6rhs.x86_64
glusterfs-rdma-3.7.1-10.el6rhs.x86_64
vdsm-gluster-4.16.20-1.2.el6rhs.noarch
python-gluster-3.7.1-8.el6rhs.x86_64
glusterfs-api-3.7.1-10.el6rhs.x86_64
glusterfs-geo-replication-3.7.1-10.el6rhs.x86_64
[root@rhsqa14-vm1 glusterfs]# 


Steps to reproduce:
=====================
1. Create a 2x2 distributed-replicate volume.
2. Mount the same volume over both the FUSE and NFS protocols.
3. Run a kernel untar on the FUSE mount.
4. Run lookups on the NFS mount, e.g. find <file_name> in a loop.
5. Start adding bricks continuously, one pair after the other: add-brick b1 b1,
add-brick b2 b2, add-brick b3 b3.
6. Bring down one of the bricks from a replica pair.
7. Start rebalance with force.
8. Observe the rebalance status, the lookups, and the untar.
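The steps above can be sketched as a dry-run script. It prints the command sequence rather than executing it, since it assumes a live two-node cluster (10.70.47.165 and 10.70.47.163) with the volume name and brick layout from this report; swap the echo-based runner for real execution on a test cluster.

```shell
# Dry-run sketch of steps 5-7; prints each command instead of running it.
run() { echo "+ $*"; }

# Step 5: add replica pairs continuously, one after the other
for n in 3 4 5; do
    run gluster volume add-brick test \
        10.70.47.165:/rhs/brick${n}/t1 10.70.47.163:/rhs/brick${n}/t1
done

# Step 6: bring down one brick of a replica pair by killing its glusterfsd
# process (PID taken from `gluster volume status`, 5626 in this report)
run kill -9 5626

# Step 7: forced rebalance, then poll its status
run gluster volume rebalance test start force
run gluster volume rebalance test status
```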

Actual result:
====================
1. Skipped files were observed in the rebalance status even though start force was used.
2. Split-brain messages were seen in the rebalance log even though no split brain had occurred on the cluster.
3. Some permission-denied messages were seen in the rebalance logs.
4. Rebalance completed successfully, but with some skipped files.


Expected result:
=================
None of the above should have happened.

Output:
===========


[root@rhsqa14-vm1 ~]# gluster v status
Status of volume: test
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.165:/rhs/brick1/t1           49152     0          Y       5626
Brick 10.70.47.163:/rhs/brick1/t1           49152     0          Y       7959
Brick 10.70.47.165:/rhs/brick2/t1           49153     0          Y       5644
Brick 10.70.47.163:/rhs/brick2/t1           49153     0          Y       7977
Snapshot Daemon on localhost                49154     0          Y       30947
NFS Server on localhost                     2049      0          Y       30956
Self-heal Daemon on localhost               N/A       N/A        Y       5670
Quota Daemon on localhost                   N/A       N/A        Y       30863
Snapshot Daemon on 10.70.47.163             49154     0          Y       9798
NFS Server on 10.70.47.163                  2049      0          Y       9806
Self-heal Daemon on 10.70.47.163            N/A       N/A        Y       8003
Quota Daemon on 10.70.47.163                N/A       N/A        Y       9764

Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks

[root@rhsqa14-vm1 ~]# 
[root@rhsqa14-vm1 ~]# 
[root@rhsqa14-vm1 ~]# gluster v add-brick test 10.70.47.165:/rhs/brick2/t1 10.70.47.163:/rhs/brick2/t1
volume add-brick: failed: Brick: 10.70.47.165:/rhs/brick2/t1 not available. Brick may be containing or be contained by an existing brick
[root@rhsqa14-vm1 ~]# gluster v add-brick test 10.70.47.165:/rhs/brick3/t1 10.70.47.163:/rhs/brick3/t1
volume add-brick: success
[root@rhsqa14-vm1 ~]# 
[root@rhsqa14-vm1 ~]# gluster v add-brick test 10.70.47.165:/rhs/brick4/t1 10.70.47.163:/rhs/brick4/t1
volume add-brick: success
[root@rhsqa14-vm1 ~]# gluster v add-brick test 10.70.47.165:/rhs/brick5/t1 10.70.47.163:/rhs/brick5/t1
volume add-brick: success
[root@rhsqa14-vm1 ~]# 
[root@rhsqa14-vm1 ~]# gluster v info
Volume Name: test
Type: Distributed-Replicate
Volume ID: f6b7bce1-4e4f-47e4-bf94-193ec7689b18
Status: Started
Number of Bricks: 5 x 2 = 10
Transport-type: tcp   
Bricks:
Brick1: 10.70.47.165:/rhs/brick1/t1
Brick2: 10.70.47.163:/rhs/brick1/t1
Brick3: 10.70.47.165:/rhs/brick2/t1
Brick4: 10.70.47.163:/rhs/brick2/t1
Brick5: 10.70.47.165:/rhs/brick3/t1
Brick6: 10.70.47.163:/rhs/brick3/t1
Brick7: 10.70.47.165:/rhs/brick4/t1
Brick8: 10.70.47.163:/rhs/brick4/t1
Brick9: 10.70.47.165:/rhs/brick5/t1
Brick10: 10.70.47.163:/rhs/brick5/t1
Options Reconfigured: 
features.uss: enable  
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
cluster.min-free-disk: 10
performance.readdir-ahead: on
[root@rhsqa14-vm1 ~]# gluster v status
Status of volume: test
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.165:/rhs/brick1/t1           49152     0          Y       5626
Brick 10.70.47.163:/rhs/brick1/t1           49152     0          Y       7959
Brick 10.70.47.165:/rhs/brick2/t1           49153     0          Y       5644
Brick 10.70.47.163:/rhs/brick2/t1           49153     0          Y       7977
Brick 10.70.47.165:/rhs/brick3/t1           49155     0          Y       31157
Brick 10.70.47.163:/rhs/brick3/t1           49155     0          Y       9942
Brick 10.70.47.165:/rhs/brick4/t1           49156     0          Y       31278
Brick 10.70.47.163:/rhs/brick4/t1           49156     0          Y       10010
Brick 10.70.47.165:/rhs/brick5/t1           49157     0          Y       31411
Brick 10.70.47.163:/rhs/brick5/t1           49157     0          Y       10094
Snapshot Daemon on localhost                49154     0          Y       30947
NFS Server on localhost                     2049      0          Y       31431
Self-heal Daemon on localhost               N/A       N/A        Y       31439
Quota Daemon on localhost                   N/A       N/A        Y       31455
Snapshot Daemon on 10.70.47.163             49154     0          Y       9798
NFS Server on 10.70.47.163                  2049      0          Y       10114
Self-heal Daemon on 10.70.47.163            N/A       N/A        Y       10122
Quota Daemon on 10.70.47.163                N/A       N/A        Y       10130

Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks
[root@rhsqa14-vm1 ~]# kill -9 5626
[root@rhsqa14-vm1 ~]# gluster v status
Status of volume: test
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.165:/rhs/brick1/t1           N/A       N/A        N       N/A
Brick 10.70.47.163:/rhs/brick1/t1           49152     0          Y       7959
Brick 10.70.47.165:/rhs/brick2/t1           49153     0          Y       5644
Brick 10.70.47.163:/rhs/brick2/t1           49153     0          Y       7977
Brick 10.70.47.165:/rhs/brick3/t1           49155     0          Y       31157
Brick 10.70.47.163:/rhs/brick3/t1           49155     0          Y       9942
Brick 10.70.47.165:/rhs/brick4/t1           49156     0          Y       31278
Brick 10.70.47.163:/rhs/brick4/t1           49156     0          Y       10010
Brick 10.70.47.165:/rhs/brick5/t1           49157     0          Y       31411
Brick 10.70.47.163:/rhs/brick5/t1           49157     0          Y       10094
Snapshot Daemon on localhost                49154     0          Y       30947
NFS Server on localhost                     2049      0          Y       31431
Self-heal Daemon on localhost               N/A       N/A        Y       31439
Quota Daemon on localhost                   N/A       N/A        Y       31455
Snapshot Daemon on 10.70.47.163             49154     0          Y       9798
NFS Server on 10.70.47.163                  2049      0          Y       10114
Self-heal Daemon on 10.70.47.163            N/A       N/A        Y       10122
Quota Daemon on 10.70.47.163                N/A       N/A        Y       10130

Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks

[root@rhsqa14-vm1 ~]# gluster v rebalance test start force; gluster v rebalance test status
volume rebalance: test: success: Rebalance on test has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: aede79a4-65da-43c7-9989-cdf63601661c

                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             2             0             0          in progress               0.00
                            10.70.47.163                0        0Bytes             6             0             0          in progress               0.00
volume rebalance: test: success:
[root@rhsqa14-vm1 ~]# gluster v rebalance test status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              209         2.0MB           775             0             0          in progress              49.00
                            10.70.47.163                0        0Bytes           616             0           278          in progress              49.00
volume rebalance: test: success:
[root@rhsqa14-vm1 ~]# gluster v rebalance test status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              360         2.4MB          1333             0             0          in progress              86.00
                            10.70.47.163                0        0Bytes          1007             0           445          in progress              86.00
volume rebalance: test: success:
[root@rhsqa14-vm1 ~]# gluster v rebalance test status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              719         3.7MB          9500             0             0          in progress             247.00
                            10.70.47.163                0        0Bytes          3308             0           513            completed             200.00
volume rebalance: test: success:
[root@rhsqa14-vm1 ~]# gluster v rebalance test status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              719         3.7MB         10290             0             0            completed             259.00
                            10.70.47.163                0        0Bytes          3308             0           513            completed             200.00
volume rebalance: test: success:
[root@rhsqa14-vm1 ~]# 
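The skipped counts in the status tables above can be pulled out with a quick awk pass over the CLI output; a small sketch using the final status table captured in this report:

```shell
# Final `gluster v rebalance test status` table as captured above
status_output='                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              719         3.7MB         10290             0             0            completed             259.00
                            10.70.47.163                0        0Bytes          3308             0           513            completed             200.00'

# Skip the two header lines; field 6 is the "skipped" column
printf '%s\n' "$status_output" | awk 'NR > 2 { print $1, $6 }'
# → localhost 0
# → 10.70.47.163 513
```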


Log messages:

[root@rhsqa14-vm1 78]# cd /var/log/glusterfs/
[root@rhsqa14-vm1 glusterfs]# grep -rine brain test-rebalance.log 
468:[2015-07-17 11:26:19.084892] E [MSGID: 108008] [afr-transaction.c:1984:afr_transaction] 0-test-replicate-2: Failing SETATTR on gfid 21782092-29bb-46a5-b05d-b037a502c9ed: split-brain observed. [Input/output error]
469:[2015-07-17 11:26:19.084941] E [MSGID: 108008] [afr-transaction.c:1984:afr_transaction] 0-test-replicate-3: Failing SETATTR on gfid 21782092-29bb-46a5-b05d-b037a502c9ed: split-brain observed. [Input/output error]
470:[2015-07-17 11:26:19.084970] E [MSGID: 108008] [afr-transaction.c:1984:afr_transaction] 0-test-replicate-4: Failing SETATTR on gfid 21782092-29bb-46a5-b05d-b037a502c9ed: split-brain observed. [Input/output error]
2272:[2015-07-17 11:27:20.627396] E [MSGID: 108008] [afr-transaction.c:1984:afr_transaction] 0-test-replicate-2: Failing SETATTR on gfid caa0d78c-1a2a-4ae2-8840-dea7e85aeea4: split-brain observed. [Input/output error]
[root@rhsqa14-vm1 glusterfs]# 


To confirm that split brain had not occurred, we checked the GFID of the affected file on both nodes:
=================================

[root@rhsqa14-vm1 78]# cd  /rhs/brick3/t1/.glusterfs/21/7
71/ 78/ 7a/ 7c/ 7e/
[root@rhsqa14-vm1 78]# cd  /rhs/brick3/t1/.glusterfs/21/78/
[root@rhsqa14-vm1 78]# ls -l 21782092-29bb-46a5-b05d-b037a502c9ed
lrwxrwxrwx. 1 root root 55 Jul 17 07:26 21782092-29bb-46a5-b05d-b037a502c9ed -> ../../4f/ee/4fee1cc5-e3d9-4cf3-a31d-294a2fe59f9c/stable
[root@rhsqa14-vm1 78]# getfattr -d -m . -e hex ../../4f/ee/4fee1cc5-e3d9-4cf3-a31d-294a2fe59f9c/stable
# file: ../../4f/ee/4fee1cc5-e3d9-4cf3-a31d-294a2fe59f9c/stable
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x2178209229bb46a5b05db037a502c9ed
trusted.glusterfs.dht=0x0000000100000000796c049abcb5e933
trusted.glusterfs.quota.4fee1cc5-e3d9-4cf3-a31d-294a2fe59f9c.contri=0x0000000000003c0000000000000000070000000000000001
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000000003c0000000000000000070000000000000001


[root@rhsqa14-vm2 78]# Brick6: 10.70.47.163:/rhs/brick3/^C
[root@rhsqa14-vm2 78]# ls -l 21782092-29bb-46a5-b05d-b037a502c9ed
lrwxrwxrwx. 1 root root 55 Jul 17 07:26 21782092-29bb-46a5-b05d-b037a502c9ed -> ../../4f/ee/4fee1cc5-e3d9-4cf3-a31d-294a2fe59f9c/stable
[root@rhsqa14-vm2 78]# getfattr -d -m . -e hex ../../4f/ee/4fee1cc5-e3d9-4cf3-a31d-294a2fe59f9c/stable
# file: ../../4f/ee/4fee1cc5-e3d9-4cf3-a31d-294a2fe59f9c/stable
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x2178209229bb46a5b05db037a502c9ed
trusted.glusterfs.dht=0x0000000100000000796c049abcb5e933
trusted.glusterfs.quota.4fee1cc5-e3d9-4cf3-a31d-294a2fe59f9c.contri=0x0000000000003c0000000000000000070000000000000001
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000000003c0000000000000000070000000000000001

[root@rhsqa14-vm2 78]# 
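The two getfattr dumps above carry identical trusted.gfid values. A small helper (a sketch, not part of the original report) converts the hex xattr form into the canonical dashed GFID string used in the split-brain log messages, making the comparison explicit:

```shell
# Convert a trusted.gfid value as printed by `getfattr -e hex` into the
# dashed GFID/UUID form used in the rebalance log messages.
hex_to_gfid() {
    h="${1#0x}"                                   # drop the 0x prefix
    printf '%s-%s-%s-%s-%s\n' \
        "${h:0:8}" "${h:8:4}" "${h:12:4}" "${h:16:4}" "${h:20:12}"
}

# trusted.gfid read from /rhs/brick3/t1 on each node (outputs above)
gfid_vm1=$(hex_to_gfid 0x2178209229bb46a5b05db037a502c9ed)
gfid_vm2=$(hex_to_gfid 0x2178209229bb46a5b05db037a502c9ed)

echo "$gfid_vm1"
# → 21782092-29bb-46a5-b05d-b037a502c9ed
[ "$gfid_vm1" = "$gfid_vm2" ] && echo "GFIDs match: no gfid split-brain"
```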


Log messages which showed permission denied :
===========================================

[2015-07-17 11:26:15.214084] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /linux-4.1-rc7/.gitignore: attempting to move from test-replicate-1 to test-replicate-3
[2015-07-17 11:26:15.289601] W [MSGID: 114031] [client-rpc-fops.c:1977:client3_3_fsetxattr_cbk] 0-test-client-6: remote operation failed [Permission denied]
[2015-07-17 11:26:15.311960] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /linux-4.1-rc7/MAINTAINERS: attempting to move from test-replicate-1 to test-replicate-3
[2015-07-17 11:26:15.413101] I [dht-rebalance.c:2328:gf_defrag_process_dir] 0-test-dht: Migration operation on dir /linux-4.1-rc7 took 0.23 secs
[2015-07-17 11:26:15.413739] W [MSGID: 114031] [client-rpc-fops.c:1977:client3_3_fsetxattr_cbk] 0-test-client-6: remote operation failed [Permission denied]
[2015-07-17 11:26:15.434240] I [MSGID: 109081] [dht-common.c:3783:dht_setxattr] 0-test-dht: fixing the layout of /linux-4.1-rc7/Documentation
[2015-07-17 11:26:15.434287] I [MSGID: 109045] [dht-selfheal.c:1496:dht_fix_layout_of_directory] 0-test-dht: subvolume 0 (test-replicate-0): 20480 chunks
[2015-07-17 11:26:15.434300] I [MSGID: 109045] [dht-selfheal.c:1496:dht_fix_layout_of_directory] 0-test-dht: subvolume 1 (test-replicate-1): 20480 chunks
[2015-07-17 11:26:15.434311] I [MSGID: 109045] [dht-selfheal.c:1496:dht_fix_layout_of_directory] 0-test-dht: subvolume 2 (test-replicate-2): 50914 chunks
[2015-07-17 11:26:15.434324] I [MSGID: 109045] [dht-selfheal.c:1496:dht_fix_layout_of_directory] 0-test-dht: subvolume 3 (test-replicate-3): 50914 chunks
[2015-07-17 11:26:15.434335] I [MSGID: 109045] [dht-selfheal.c:1496:dht_fix_layout_of_directory] 0-test-dht: subvolume 4 (test-replicate-4): 50914 chunks
[2015-07-17 11:26:15.459900] I [dht-rebalance.c:2119:gf_defrag_process_dir] 0-test-dht: migrate data called on /linux-4.1-rc7/Documentation
[2015-07-17 11:26:15.729080] I [MSGID: 109022] [dht-rebalance.c:1282:dht_migrate_file] 0-test-dht: completed migration of /linux-4.1-rc7/.gitignore from subvolume test-replicate-1 to test-replicate-3
[2015-07-17 11:26:15.766935] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /linux-4.1-rc7/Documentation/BUG-HUNTING: attempting to move from test-replicate-1 to test-replicate-4
[2015-07-17 11:26:15.829610] W [MSGID: 114031] [client-rpc-fops.c:1977:client3_3_fsetxattr_cbk] 0-test-client-8: remote operation failed [Permission denied]
[2015-07-17 11:26:15.841553] I [MSGID: 109022] [dht-rebalance.c:1282:dht_migrate_file] 0-test-dht: completed migration of /linux-4.1-rc7/MAINTAINERS from subvolume test-replicate-1 to test-replicate-3
[2015-07-17 11:26:15.856166] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /linux-4.1-rc7/Documentation/CodingStyle: attempting to move from test-replicate-1 to test-replicate-3
[2015-07-17 11:26:15.936478] W [MSGID: 114031] [client-rpc-fops.c:1977:client3_3_fsetxattr_cbk] 0-test-client-6: remote operation failed [Permission denied]
[2015-07-17 11:26:16.081485] I [MSGID: 109022] [dht-rebalance.c:1282:dht_migrate_file] 0-test-dht: completed migration of /linux-4.1-rc7/Documentation/BUG-HUNTING from subvolume test-replicate-1 to test-replicate-4
[2015-07-17 11:26:16.093080] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /linux-4.1-rc7/Documentation/IPMI.txt: attempting to move from test-replicate-1 to test-replicate-4
[2015-07-17 11:26:16.220994] W [MSGID: 114031] [client-rpc-fops.c:1977:client3_3_fsetxattr_cbk] 0-test-client-8: remote operation failed [Permission denied]
[2015-07-17 11:26:16.236577] I [MSGID: 109022] [dht-rebalance.c:1282:dht_migrate_file] 0-test-dht: completed migration of /linux-4.1-rc7/Documentation/CodingStyle from subvolume test-replicate-1 to test-replicate-3
[2015-07-17 11:26:16.244167] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /linux-4.1-rc7/Documentation/SecurityBugs: attempting to move from test-replicate-1 to test-replicate-3
[2015-07-17 11:26:16.272917] W [MSGID: 114031] [client-rpc-fops.c:1977:client3_3_fsetxattr_cbk] 0-test-client-6: remote operation failed [Permission denied]
[2015-07-17 11:26:16.483936] I [MSGID: 109022] [dht-rebalance.c:1282:dht_migrate_file] 0-test-dht: completed migration of /linux-4.1-rc7/Documentation/SecurityBugs from subvolume test-replicate-1 to test-replicate-3
[2015-07-17 11:26:16.498522] I [MSGID: 109022] [dht-rebalance.c:1282:dht_migrate_file] 0-test-dht: completed migration of /linux-4.1-rc7/Documentation/IPMI.txt from subvolume test-replicate-1 to test-replicate-4
[2015-07-17 11:26:16.509024] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /linux-4.1-rc7/Documentation/SubmittingPatches: attempting to move from test-replicate-1 to test-replicate-3
[2015-07-17 11:26:16.512841] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /linux-4.1-rc7/Documentation/braille-console.txt: attempting to move from test-replicate-1 to test-replicate-3
[2015-07-17 11:26:16.550083] W [MSGID: 114031] [client-rpc-fops.c:1977:client3_3_fsetxattr_cbk] 0-test-client-6: remote operation failed [Permission denied]

Comment 2 Triveni Rao 2015-07-17 13:10:55 UTC
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1244197/ 

sosreports uploaded.

Comment 5 Susant Kumar Palai 2015-09-24 10:17:07 UTC
Moving the component to AFR as this is reported as a split-brain issue.

Comment 6 Pranith Kumar K 2015-12-24 06:25:12 UTC
The rebalance process was not sending lookups in some cases, which is fixed as part of bug 1243542. You can see similar messages for WRITE in that bug: [2015-07-15 18:12:15.544061] E [MSGID: 108008] [afr-transaction.c:1984:afr_transaction] 12-vol1-replicate-5: Failing WRITE on gfid b7d21675-6fd8-472a-b7d9-71d7436c614d: split-brain observed. [Input/output error].

I was trying to find out more about the permission-denied problems. Only client logs are present, and the brick logs don't show any permission-denied errors. Is this issue still happening? Can we get new sosreports with this issue?

