Bug 1059691 - [RHEV-RHS] App VMs getting paused, after 'remove-brick start' operation
Summary: [RHEV-RHS] App VMs getting paused, after 'remove-brick start' operation
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks: 1286147 1286148
 
Reported: 2014-01-30 11:26 UTC by SATHEESARAN
Modified: 2015-11-27 12:01 UTC
CC List: 12 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned to: 1286147 1286148
Environment:
virt rhev integration
Last Closed: 2015-11-27 11:45:17 UTC
Embargoed:


Attachments
sosreport from RHSS Node1 (14.58 MB, application/x-xz)
2014-01-30 11:38 UTC, SATHEESARAN
no flags
error message from /var/log/dmesg (6.64 KB, text/plain)
2014-01-30 13:12 UTC, SATHEESARAN
no flags

Description SATHEESARAN 2014-01-30 11:26:36 UTC
Description of problem:
-----------------------
App VMs get paused after a 'remove-brick start' operation, with the error "storage domain error"

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHSS 2.1 Update2 - glusterfs-3.4.0.58rhs-1.el6rhs

How reproducible:
-----------------
It happened once out of two attempts

Steps to Reproduce:
-------------------
1. A distributed-replicate volume of 8x2 was serving as the VM image store in RHEV
2. Created an App VM with a bootable image of 30GB
3. Installed this App VM with RHEL 6.4
4. IO was running inside this App VM, simulated with a 'dd' command
(i.e) while true; do dd if=/dev/urandom of=/home/file1 bs=1024k count=1000; done
5. Found the replica pair containing the disk image and removed those bricks (see the sketch after this list)
(i.e) gluster volume remove-brick <vol-name> <brick1> <brick2> start
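A minimal sketch of step 5, assuming the usual GlusterFS pathinfo xattr query on the FUSE mount to map the disk image to its backing bricks (the image path is abbreviated here; the brick pair shown is the one removed in this run, per the task status below):

# On the hypervisor: list the bricks currently holding the image file
getfattr -n trusted.glusterfs.pathinfo -e text /rhev/data-center/mnt/glusterSD/10.70.37.198:dr-store/<path-to-disk-image>

# On an RHSS node: start removal of that replica pair (data is migrated off the bricks)
gluster volume remove-brick dr-store 10.70.37.198:/rhs/brick2/dsr2 10.70.37.46:/rhs/brick2/dsr2 start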

Actual results:
----------------
After some time, the App VM was noticed to be paused due to a storage domain error

Expected results:
-----------------
The App VM should not get paused and should continue to work while remove-brick is in progress

Additional info:
----------------

Installation Info
==================
1. RHSS ISO Used - RHSS2.1Update2 RC - RHSS-2.1-20140116.2-RHS-x86_64-DVD1
available @ http://download.devel.redhat.com/composes/candidates/RHSS-2.1-20140116.2/2.1/RHS/x86_64/iso/RHSS-2.1-20140116.2-RHS-x86_64-DVD1.iso

2. Packages - glusterfs-3.4.0.58rhs-1.el6rhs
available @ http://download.devel.redhat.com/brewroot/packages/glusterfs/3.4.0.58rhs/1.el6rhs/x86_64/

Setup Info
===========
1. There were 2 RHSS Nodes in the trusted storage pool
(i.e) rhss4.lab.eng.blr.redhat.com - 10.70.37.198
      rhss3.lab.eng.blr.redhat.com - 10.70.37.46

2. All the commands were executed from rhss4.lab.eng.blr.redhat.com (i.e) 10.70.37.198
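A routine check (not taken from the report) to confirm the two-node trusted storage pool from rhss4:

gluster peer status
# expected to list 10.70.37.46 (rhss3) as a connected peer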

Client side Info
================
1. rhs-client15.lab.eng.blr.redhat.com - 10.70.36.39, installed with RHEL 6.5, was used as the hypervisor

2. The gluster volume was mounted at:
/rhev/data-center/mnt/glusterSD/10.70.37.198:dr-store

[root@rhs-client15 ~]# df -Th
Filesystem                         Type            Size  Used Avail Use% Mounted on
/dev/mapper/vg_rhsclient15-lv_root ext4             50G  2.9G   44G   7% /
tmpfs                              tmpfs           7.8G  4.0K  7.8G   1% /dev/shm
/dev/sda1                          ext4            485M   96M  364M  21% /boot
/dev/mapper/vg_rhsclient15-lv_home ext4            1.8T  196M  1.7T   1% /home
10.70.37.198:dr-store              fuse.glusterfs  800G   12G  788G   2% /rhev/data-center/mnt/glusterSD/10.70.37.198:dr-store
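For reference, a FUSE mount of this form is normally created by vdsm when the storage domain is activated; an equivalent manual mount would look roughly like the following (a sketch, not the exact command used on this setup):

mount -t glusterfs 10.70.37.198:/dr-store /rhev/data-center/mnt/glusterSD/10.70.37.198:dr-store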

3. Image file that was under test:
[root@rhs-client15 ~]# ls -l /rhev/data-center/mnt/glusterSD/10.70.37.198\:dr-store/bb81265f-4402-4d3d-8227-00dc546fccc8/images/49e958ff-74e4-4fdc-8880-92b545c22ce4/
total 2355433
-rw-rw----. 1 vdsm kvm 32212254720 Jan 29 10:40 7b5cd668-90f5-465e-a4ed-610c4a182e1a
-rw-rw----. 1 vdsm kvm     1048576 Jan 29 04:36 7b5cd668-90f5-465e-a4ed-610c4a182e1a.lease
-rw-r--r--. 1 vdsm kvm         268 Jan 29 04:36 7b5cd668-90f5-465e-a4ed-610c4a182e1a.meta

[root@rhs-client15 ~]# qemu-img info /rhev/data-center/mnt/glusterSD/10.70.37.198\:dr-store/bb81265f-4402-4d3d-8227-00dc546fccc8/images/49e958ff-74e4-4fdc-8880-92b545c22ce4/7b5cd668-90f5-465e-a4ed-610c4a182e1a
image: /rhev/data-center/mnt/glusterSD/10.70.37.198:dr-store/bb81265f-4402-4d3d-8227-00dc546fccc8/images/49e958ff-74e4-4fdc-8880-92b545c22ce4/7b5cd668-90f5-465e-a4ed-610c4a182e1a
file format: raw
virtual size: 30G (32212254720 bytes)
disk size: 2.2G

Volume Info
============
[Thu Jan 30 09:25:27 UTC 2014 root.37.198:~ ] # gluster v i dr-store

Volume Name: dr-store
Type: Distributed-Replicate
Volume ID: 41e2a46b-d320-45c1-810c-19900b1c8bcc
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: 10.70.37.198:/rhs/brick1/dsr1
Brick2: 10.70.37.46:/rhs/brick1/dsr1
Brick3: 10.70.37.198:/rhs/brick2/dsr2
Brick4: 10.70.37.46:/rhs/brick2/dsr2
Brick5: 10.70.37.198:/rhs/brick3/dsr3
Brick6: 10.70.37.46:/rhs/brick3/dsr3
Brick7: 10.70.37.198:/rhs/brick4/dsr4
Brick8: 10.70.37.46:/rhs/brick4/dsr4
Brick9: 10.70.37.198:/rhs/brick4/dsr5
Brick10: 10.70.37.46:/rhs/brick4/dsr5
Brick11: 10.70.37.198:/rhs/brick4/dsr6
Brick12: 10.70.37.46:/rhs/brick4/dsr6
Brick13: 10.70.37.198:/rhs/brick4/dsr7
Brick14: 10.70.37.46:/rhs/brick4/dsr7
Brick15: 10.70.37.198:/rhs/brick4/dsr8
Brick16: 10.70.37.46:/rhs/brick4/dsr8
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
user.cifs: enable
nfs.disable: off
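The option set above corresponds to the standard RHS virt-store profile. As a sketch (the 'group virt' shortcut and the vdsm:kvm uid/gid of 36 are the usual recommendation, not necessarily the exact commands run on this setup), it would typically be applied with:

gluster volume set dr-store group virt
gluster volume set dr-store storage.owner-uid 36
gluster volume set dr-store storage.owner-gid 36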

2. Volume status
[Thu Jan 30 11:20:09 UTC 2014 root.37.198:~ ] # gluster volume status
Status of volume: dr-store
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.198:/rhs/brick1/dsr1                     49155   Y       20689
Brick 10.70.37.46:/rhs/brick1/dsr1                      49153   Y       753
Brick 10.70.37.198:/rhs/brick2/dsr2                     49156   Y       20700
Brick 10.70.37.46:/rhs/brick2/dsr2                      49154   Y       764
Brick 10.70.37.198:/rhs/brick3/dsr3                     49157   Y       23733
Brick 10.70.37.46:/rhs/brick3/dsr3                      49155   Y       3684
Brick 10.70.37.198:/rhs/brick4/dsr4                     49158   Y       24228
Brick 10.70.37.46:/rhs/brick4/dsr4                      49156   Y       3888
Brick 10.70.37.198:/rhs/brick4/dsr5                     49159   Y       24574
Brick 10.70.37.46:/rhs/brick4/dsr5                      49157   Y       4089
Brick 10.70.37.198:/rhs/brick4/dsr6                     49160   Y       25867
Brick 10.70.37.46:/rhs/brick4/dsr6                      49158   Y       5104
Brick 10.70.37.198:/rhs/brick4/dsr7                     49161   Y       26001
Brick 10.70.37.46:/rhs/brick4/dsr7                      49159   Y       5409
Brick 10.70.37.198:/rhs/brick4/dsr8                     49162   Y       26215
Brick 10.70.37.46:/rhs/brick4/dsr8                      49160   Y       5572
NFS Server on localhost                                 2049    Y       26227
Self-heal Daemon on localhost                           N/A     Y       26234
NFS Server on 10.70.37.46                               2049    Y       5596
Self-heal Daemon on 10.70.37.46                         N/A     Y       5603
 
Task Status of Volume dr-store
------------------------------------------------------------------------------
Task                 : Remove brick        
ID                   : af6839fa-9009-458d-891b-eb4eed118ca4
Removed bricks:     
10.70.37.198:/rhs/brick2/dsr2
10.70.37.46:/rhs/brick2/dsr2
Status               : completed
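For context, a remove-brick task of this kind is monitored and finalised with the usual CLI forms, shown here as a sketch using the bricks from this task:

gluster volume remove-brick dr-store 10.70.37.198:/rhs/brick2/dsr2 10.70.37.46:/rhs/brick2/dsr2 status
gluster volume remove-brick dr-store 10.70.37.198:/rhs/brick2/dsr2 10.70.37.46:/rhs/brick2/dsr2 commit
# or, to abort the operation instead of committing it:
gluster volume remove-brick dr-store 10.70.37.198:/rhs/brick2/dsr2 10.70.37.46:/rhs/brick2/dsr2 stop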

Comment 1 SATHEESARAN 2014-01-30 11:38:11 UTC
Created attachment 857429 [details]
sosreport from RHSS Node1

Attached the sosreport from RHSS Node1 - 10.70.37.198

Comment 3 SATHEESARAN 2014-01-30 11:49:37 UTC
Possible error statements from the fuse log:

<snip>
[2014-01-29 21:08:18.402715] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.402761] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.402771] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.402796] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166265: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:18.403079] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403114] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403123] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.403141] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166266: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:18.403514] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403535] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403544] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.403561] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166268: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:18.403924] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403944] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403953] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.403970] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166270: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:18.407191] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.407229] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.407239] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.407260] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166272: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:19.712010] I [dht-layout.c:760:dht_layout_dir_mismatch] 6-dr-store-dht: subvol: dr-store-replicate-0; inode layout - 3067833780 - 3681400535; disk layout - 2454267024 - 3067833779
[2014-01-29 21:08:19.712054] I [dht-common.c:639:dht_revalidate_cbk] 6-dr-store-dht: mismatching layouts for /bb81265f-4402-4d3d-8227-00dc546fccc8/master
[2014-01-29 21:08:19.712147] I [dht-layout.c:760:dht_layout_dir_mismatch] 6-dr-store-dht: subvol: dr-store-replicate-7; inode layout - 2454267024 - 3067833779; disk layout - 3067833780 - 3681400535
[2014-01-29 21:08:19.712175] I [dht-common.c:639:dht_revalidate_cbk] 6-dr-store-dht: mismatching layouts for /bb81265f-4402-4d3d-8227-00dc546fccc8/master
</snip>

Comment 4 SATHEESARAN 2014-01-30 11:51:07 UTC
This issue looked similar to the bug encountered in the RHS-RHOS environment with respect to Glance: https://bugzilla.redhat.com/show_bug.cgi?id=1054782

After discussing with Shyam, he recommended creating a separate bug, as the test case for this issue is different.

Comment 5 SATHEESARAN 2014-01-30 11:55:33 UTC
The same test case passed in Test Run for RHSS 2.1 Update1
Ref - https://tcms.engineering.redhat.com/run/90297/?from_plan=7048#caserun_3503134

Hence, adding the "REGRESSION" keyword to this bug

Comment 7 SATHEESARAN 2014-01-30 13:10:10 UTC
Notes providing further information:

1. "remove-brick start" operation was not yet committed and VMs went to paused state even before that


Consequence of App VM after they went to paused state:
1. App VM could not be resumed from RHEVM UI
and this doesn't get started
2. Did a forced shutdown and that restarted the App VM
App VM came up and noticed some error messages in /var/log/dmesg

<snip>
EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1308175
EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1308168
EXT4-fs (dm-0): 80 orphan inodes deleted
EXT4-fs (dm-0): recovery complete
</snip>

The messages above indicate that some filesystem corruption had occurred and that the ext4 recovery/fsck rectified those errors
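If a deeper check is wanted after such an event, a full fsck can be forced; a minimal sketch for a RHEL 6 guest (the device path is an assumption, adjust to the guest's actual root LV):

# Option 1: force a full fsck of the root filesystem on the next boot
touch /forcefsck
reboot

# Option 2: from a rescue environment, run a read-only check on the unmounted root LV
e2fsck -f -n /dev/mapper/vg_guest-lv_root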

Comment 8 SATHEESARAN 2014-01-30 13:12:28 UTC
Created attachment 857471 [details]
error message from /var/log/dmesg

error message from /var/log/dmesg

Comment 9 Shyamsundar 2014-01-31 05:56:24 UTC
This bug has the same root cause as Bug#1059687

This is a race condition that has existed across a couple of releases; hence this is not a regression, we have simply hit it only now.

We now have a reproducible scenario where we hit this problem, for further analysis and fix (as mentioned in the other bug).

Request that the regression flag hence be removed.

Comment 10 Vivek Agarwal 2014-01-31 06:29:46 UTC
Based on https://bugzilla.redhat.com/show_bug.cgi?id=1054782#c10, this should not be treated as a regression/blocker.

https://bugzilla.redhat.com/show_bug.cgi?id=1059687 is a forked bug which was filed for this specific case.

Comment 11 Gowrishankar Rajaiyan 2014-01-31 06:37:35 UTC
(In reply to Shyamsundar from comment #9)
> This bug has the same root cause as Bug#1059687
[snip]
> Request that the regression flag hence be removed.

Ack and thanks for looking into this.

Comment 12 Nagaprasad Sathyanarayana 2014-05-06 11:43:37 UTC
Dev ack to 3.0 RHS BZs

Comment 14 Susant Kumar Palai 2015-11-27 11:45:17 UTC
Cloning this to 3.1. To be fixed in a future release.

