Description of problem:
Did a replace-brick by running the command 'gluster volume replace-brick data zod.lab.eng.blr.redhat.com:/rhgs/data/data-brick3 <old brick> zod.lab.eng.blr.redhat.com:/rhgs/data/data-brick4 <new brick> commit force'. I see that some of my VMs go to a paused state, and 'gluster vol heal data info' always reports /.shard.

Output from gluster vol heal data info:

[root@zod data]# gluster vol heal data info
Brick sulphur.lab.eng.blr.redhat.com:/rhgs/data/data-brick1
/.shard
Status: Connected
Number of entries: 1

Brick tettnang.lab.eng.blr.redhat.com:/rhgs/data/data-brick2
/.shard
Status: Connected
Number of entries: 1

Brick zod.lab.eng.blr.redhat.com:/rhgs/data/data-brick4
Status: Connected
Number of entries: 0

Version-Release number of selected component (if applicable):
glusterfs-3.7.9-4.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Have I/O running on your VMs.
2. Run the command to replace the brick:
   gluster volume replace-brick data zod.lab.eng.blr.redhat.com:/rhgs/data/data-brick3 <old brick> zod.lab.eng.blr.redhat.com:/rhgs/data/data-brick4 <new brick> commit force

Actual results:
VMs go to a paused state and /.shard is always reported in the heal info output.

Expected results:
VMs should not go to a paused state and heal info should not always report /.shard in the output.

Additional info:
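For monitoring the heal state while reproducing this, the per-brick entry counts can be pulled out of the heal info text with a small script. This is only an illustrative sketch that parses sample output hard-coded from this report; in practice the text would come from running `gluster vol heal data info`.

```python
# Illustrative sketch: count pending heal entries per brick from
# `gluster vol heal data info` output (sample text from this report).
SAMPLE = """\
Brick sulphur.lab.eng.blr.redhat.com:/rhgs/data/data-brick1
/.shard
Status: Connected
Number of entries: 1

Brick tettnang.lab.eng.blr.redhat.com:/rhgs/data/data-brick2
/.shard
Status: Connected
Number of entries: 1

Brick zod.lab.eng.blr.redhat.com:/rhgs/data/data-brick4
Status: Connected
Number of entries: 0
"""

def entries_per_brick(text):
    """Map brick name -> its 'Number of entries' count."""
    counts, brick = {}, None
    for line in text.splitlines():
        if line.startswith("Brick "):
            brick = line[len("Brick "):].strip()
        elif line.startswith("Number of entries:") and brick is not None:
            counts[brick] = int(line.split(":", 1)[1])
    return counts

counts = entries_per_brick(SAMPLE)
print(counts)
```

With the sample above this reports 1 pending entry (/.shard) on each of the two old bricks and 0 on the replaced brick, matching the heal info output in the description.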
Failures seen in mount log of data:
=======================================
[2016-05-16 05:34:53.939937] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 2-data-client-2: remote operation failed [No such file or directory]
The message "W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 2-data-client-2: remote operation failed [No such file or directory]" repeated 9 times between [2016-05-16 05:34:53.939937] and [2016-05-16 05:34:53.941235]
[2016-05-16 05:34:53.941673] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:53.941891] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022670: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.941942] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022675: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.941961] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022668: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.944489] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022673: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.944924] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022671: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.945027] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022674: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.945172] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022669: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.945234] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022672: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.945305] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022677: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.945402] W [fuse-bridge.c:2305:fuse_writev_cbk] 0-glusterfs-fuse: 5022676: WRITE => -1 (Input/output error)
[2016-05-16 05:34:53.949119] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:53.950339] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:53.952624] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:53.956553] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:53.957084] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:54.310983] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:54.316845] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:54.317162] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:54.317443] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:54.321221] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:54.322126] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-05-16 05:34:54.323285] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-data-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
Adding the blocker flag to the release, as the replace-brick results in VM pause. Replacing the brick is the only way to remove failed hard drives from the volume configuration.
Regarding /.shard always being in heal info output: some files were created on the backend directly in the /.shard directory, meaning these files were not created from the mount. When self-heal was triggered, lookup on such a file failed with ENODATA, so undo-pending was not performed. As a result, the pending markers on /.shard were not cleared, which is why it kept showing up in the heal info output even though no heal was needed. Given that these files were created directly on the backend and not from the mount, this case is not valid.

Kasturi, I'm adding a needinfo on you to confirm that the "/.shard always shows up in heal info" part of this bug was logged for the same issue I debugged yesterday.

VM pause on replace-brick still needs to be looked into.
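For context on the "pending markers" mentioned above: AFR tracks pending operations per client in trusted.afr.<volname>-client-N extended attributes on each brick, conventionally a 12-byte value holding three big-endian 32-bit counters (data, metadata, entry). A minimal sketch of decoding such a value follows; the example bytes are made up for illustration, not taken from this setup.

```python
import struct

def decode_afr_pending(raw: bytes):
    """Decode an AFR changelog xattr value: three big-endian
    32-bit counters for pending data, metadata and entry ops."""
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}

# Hypothetical value, as it might be read on a brick with e.g.
#   getfattr -e hex -n trusted.afr.data-client-2 /rhgs/data/data-brick1/.shard
raw = bytes.fromhex("000000000000000000000002")
pending = decode_afr_pending(raw)
print(pending)
```

A non-zero entry counter on /.shard is what makes it appear in heal info; if lookup fails with ENODATA before undo-pending runs, that counter never gets cleared, matching the behaviour described above.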
Yes Anuradha, I agree with you. /.shard was always seen in the heal info output because of the out.txt being present on the backend. Now heal info shows zero entries, which means the heal has completed. But I am unable to boot one of the VMs which was in the paused state: the VM says "no bootable device", and I do see some entries in the heal info output for the brick which was replaced.
RCA is still in progress, as per what I heard from the AFR team.
Just updating with the discussion our team had about this issue: http://review.gluster.org/14369 should most probably fix this issue as well, since similar symptoms are seen when this issue is observed.
Kasturi,
Could you confirm whether the issue is seen on the latest release as well, which has the fix mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1336295#c8 ?

Pranith
Downstream patch https://code.engineering.redhat.com/gerrit/74759 has made it into rhgs-3.1.3. As per comment 8, dev feels that this issue is taken care of by that patch, so moving the bug state to ON_QA.
Verified and works fine with build glusterfs-3.7.9-6.el7rhgs.x86_64. Did a replace-brick commit force from the old brick to the new one while I/O was happening on the volume. The replace-brick operation went through successfully and I did not see any VM pauses.

I did observe some behavior on the VMs while performing the above exercise: they report "kernel:BUG: soft lockup - CPU#1 stuck for 22s! [fio:22835]", and event messages on the UI say "storage domain data experienced high latency of 9.466 seconds from host, this may cause performance and functional issues. Please consult your storage administrator". There is a patch for addressing the latency; I will verify the soft lockup again there. Since the actual problem of VMs going to a paused state while replace-brick is happening got fixed, moving this to on_qa. Will reopen this if I see the issue again.
Since the actual problem of VMs going to a paused state while replace-brick is happening got fixed, moving this to verified. Will reopen this if it happens again.
(In reply to RamaKasturi from comment #13)
> I did observe some behavior on the vms while i performed the above exercise,
> it says "kernel:BUG: soft lockup - CPU#1 stuck for 22s! [fio:22835]" and
> event messages on UI says "storage domain data experienced high latency of
> 9.466 seconds from host, this may cause performance and functional
> issues.Please consult your storage administrator". There is patch for
> addressing latency, will verify for the soft lock there again.

For what it's worth: the latency issue is tracked under a separate bug, https://bugzilla.redhat.com/show_bug.cgi?id=1339136 , for which Krutika was provided a private build with O_DIRECT enabled.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240