Description of problem:
-----------------------
App VMs are getting paused after a 'remove-brick start' operation, with the error "storage domain error".

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHSS 2.1 Update 2 - glusterfs-3.4.0.58rhs-1.el6rhs

How reproducible:
-----------------
Happened once out of two attempts

Steps to Reproduce:
-------------------
1. A distributed-replicate volume of 8x2 was serving as the VM image store in RHEV
2. Created an App VM with a bootable image of 30GB
3. Installed this App VM with RHEL 6.4
4. I/O was running inside this App VM, simulated with 'dd', i.e.:
   while true; do dd if=/dev/urandom of=/home/file1 bs=1024k count=1000; done
5. Found the replica pair containing the disk image and removed those bricks, i.e.:
   gluster volume remove-brick <vol-name> <brick1> <brick2> start

Actual results:
---------------
After some time, noticed that the App VM got paused due to a storage domain error

Expected results:
-----------------
The App VM should not get paused and should continue to work while remove-brick is in progress

Additional info:
----------------

Installation Info
=================
1. RHSS ISO used - RHSS 2.1 Update 2 RC - RHSS-2.1-20140116.2-RHS-x86_64-DVD1, available @
   http://download.devel.redhat.com/composes/candidates/RHSS-2.1-20140116.2/2.1/RHS/x86_64/iso/RHSS-2.1-20140116.2-RHS-x86_64-DVD1.iso
2. Packages - glusterfs-3.4.0.58rhs-1.el6rhs, available @
   http://download.devel.redhat.com/brewroot/packages/glusterfs/3.4.0.58rhs/1.el6rhs/x86_64/

Setup Info
==========
1. There were 2 RHSS nodes in the trusted storage pool, i.e.:
   rhss4.lab.eng.blr.redhat.com - 10.70.37.198
   rhss3.lab.eng.blr.redhat.com - 10.70.37.46
2. All the commands were executed from rhss4.lab.eng.blr.redhat.com (10.70.37.198)

Client side Info
================
1. rhs-client15.lab.eng.blr.redhat.com - 10.70.36.39, installed with RHEL 6.5, was used as the hypervisor
2.
The gluster volume was mounted at /rhev/data-center/mnt/glusterSD/10.70.37.198:dr-store

[root@rhs-client15 ~]# df -Th
Filesystem                         Type            Size  Used Avail Use% Mounted on
/dev/mapper/vg_rhsclient15-lv_root ext4             50G  2.9G   44G   7% /
tmpfs                              tmpfs           7.8G  4.0K  7.8G   1% /dev/shm
/dev/sda1                          ext4            485M   96M  364M  21% /boot
/dev/mapper/vg_rhsclient15-lv_home ext4            1.8T  196M  1.7T   1% /home
10.70.37.198:dr-store              fuse.glusterfs  800G   12G  788G   2% /rhev/data-center/mnt/glusterSD/10.70.37.198:dr-store

3. Image file that was under test:

[root@rhs-client15 ~]# ls -l /rhev/data-center/mnt/glusterSD/10.70.37.198\:dr-store/bb81265f-4402-4d3d-8227-00dc546fccc8/images/49e958ff-74e4-4fdc-8880-92b545c22ce4/
total 2355433
-rw-rw----. 1 vdsm kvm 32212254720 Jan 29 10:40 7b5cd668-90f5-465e-a4ed-610c4a182e1a
-rw-rw----. 1 vdsm kvm     1048576 Jan 29 04:36 7b5cd668-90f5-465e-a4ed-610c4a182e1a.lease
-rw-r--r--. 1 vdsm kvm         268 Jan 29 04:36 7b5cd668-90f5-465e-a4ed-610c4a182e1a.meta

[root@rhs-client15 ~]# qemu-img info /rhev/data-center/mnt/glusterSD/10.70.37.198\:dr-store/bb81265f-4402-4d3d-8227-00dc546fccc8/images/49e958ff-74e4-4fdc-8880-92b545c22ce4/7b5cd668-90f5-465e-a4ed-610c4a182e1a
image: /rhev/data-center/mnt/glusterSD/10.70.37.198:dr-store/bb81265f-4402-4d3d-8227-00dc546fccc8/images/49e958ff-74e4-4fdc-8880-92b545c22ce4/7b5cd668-90f5-465e-a4ed-610c4a182e1a
file format: raw
virtual size: 30G (32212254720 bytes)
disk size: 2.2G

Volume Info
===========
[Thu Jan 30 09:25:27 UTC 2014 root.37.198:~ ] # gluster v i dr-store

Volume Name: dr-store
Type: Distributed-Replicate
Volume ID: 41e2a46b-d320-45c1-810c-19900b1c8bcc
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: 10.70.37.198:/rhs/brick1/dsr1
Brick2: 10.70.37.46:/rhs/brick1/dsr1
Brick3: 10.70.37.198:/rhs/brick2/dsr2
Brick4: 10.70.37.46:/rhs/brick2/dsr2
Brick5: 10.70.37.198:/rhs/brick3/dsr3
Brick6: 10.70.37.46:/rhs/brick3/dsr3
Brick7: 10.70.37.198:/rhs/brick4/dsr4
Brick8: 10.70.37.46:/rhs/brick4/dsr4
Brick9: 10.70.37.198:/rhs/brick4/dsr5
Brick10: 10.70.37.46:/rhs/brick4/dsr5
Brick11: 10.70.37.198:/rhs/brick4/dsr6
Brick12: 10.70.37.46:/rhs/brick4/dsr6
Brick13: 10.70.37.198:/rhs/brick4/dsr7
Brick14: 10.70.37.46:/rhs/brick4/dsr7
Brick15: 10.70.37.198:/rhs/brick4/dsr8
Brick16: 10.70.37.46:/rhs/brick4/dsr8
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
user.cifs: enable
nfs.disable: off

2. Volume status

[Thu Jan 30 11:20:09 UTC 2014 root.37.198:~ ] # gluster volume status
Status of volume: dr-store
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.198:/rhs/brick1/dsr1             49155   Y       20689
Brick 10.70.37.46:/rhs/brick1/dsr1              49153   Y       753
Brick 10.70.37.198:/rhs/brick2/dsr2             49156   Y       20700
Brick 10.70.37.46:/rhs/brick2/dsr2              49154   Y       764
Brick 10.70.37.198:/rhs/brick3/dsr3             49157   Y       23733
Brick 10.70.37.46:/rhs/brick3/dsr3              49155   Y       3684
Brick 10.70.37.198:/rhs/brick4/dsr4             49158   Y       24228
Brick 10.70.37.46:/rhs/brick4/dsr4              49156   Y       3888
Brick 10.70.37.198:/rhs/brick4/dsr5             49159   Y       24574
Brick 10.70.37.46:/rhs/brick4/dsr5              49157   Y       4089
Brick 10.70.37.198:/rhs/brick4/dsr6             49160   Y       25867
Brick 10.70.37.46:/rhs/brick4/dsr6              49158   Y       5104
Brick 10.70.37.198:/rhs/brick4/dsr7             49161   Y       26001
Brick 10.70.37.46:/rhs/brick4/dsr7              49159   Y       5409
Brick 10.70.37.198:/rhs/brick4/dsr8             49162   Y       26215
Brick 10.70.37.46:/rhs/brick4/dsr8              49160   Y       5572
NFS Server on localhost                         2049    Y       26227
Self-heal Daemon on localhost                   N/A     Y       26234
NFS Server on 10.70.37.46                       2049    Y       5596
Self-heal Daemon on 10.70.37.46                 N/A     Y       5603

Task Status of Volume dr-store
------------------------------------------------------------------------------
Task           : Remove brick
ID             : af6839fa-9009-458d-891b-eb4eed118ca4
Removed bricks:
10.70.37.198:/rhs/brick2/dsr2
10.70.37.46:/rhs/brick2/dsr2
Status         : completed
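For quick re-runs, the reproduction steps above can be condensed into a dry-run shell sketch. This is only an illustration: it prints the commands that would be run rather than executing them, and the volume name and brick pair are the ones from this report's volume status.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps in this report. It only
# prints the commands; nothing here touches a real cluster. VOL and
# the brick pair are taken from the volume status shown above.

VOL=dr-store
BRICK1=10.70.37.198:/rhs/brick2/dsr2
BRICK2=10.70.37.46:/rhs/brick2/dsr2

repro_commands() {
    # Step 4: I/O loop run *inside* the App VM against its 30GB image
    echo "while true; do dd if=/dev/urandom of=/home/file1 bs=1024k count=1000; done"
    # Step 5: remove the replica pair that holds the disk image
    echo "gluster volume remove-brick $VOL $BRICK1 $BRICK2 start"
    # Then poll migration status; commit only after it reports completed
    echo "gluster volume remove-brick $VOL $BRICK1 $BRICK2 status"
}

repro_commands
```

Note that in this report the VM paused before the remove-brick was ever committed, i.e. during the data-migration phase started by 'start'.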
Created attachment 857429 [details]
sosreport from RHSS Node1

Attached the sosreports from RHSS Node1 - 10.70.37.198
Possible error statements from the fuse logs:

<snip>
[2014-01-29 21:08:18.402715] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.402761] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.402771] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.402796] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166265: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:18.403079] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403114] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403123] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.403141] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166266: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:18.403514] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403535] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403544] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.403561] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166268: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:18.403924] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403944] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.403953] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.403970] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166270: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:18.407191] I [afr-lk-common.c:1528:afr_nonblocking_inodelk] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.407229] I [afr-lk-common.c:1075:afr_lock_blocking] 6-dr-store-replicate-6: unable to get fd ctx for fd=0x7498f48
[2014-01-29 21:08:18.407239] I [afr-transaction.c:1119:afr_post_blocking_inodelk_cbk] 6-dr-store-replicate-6: Blocking inodelks failed.
[2014-01-29 21:08:18.407260] W [fuse-bridge.c:2696:fuse_writev_cbk] 0-glusterfs-fuse: 166272: WRITE => -1 (Invalid argument)
[2014-01-29 21:08:19.712010] I [dht-layout.c:760:dht_layout_dir_mismatch] 6-dr-store-dht: subvol: dr-store-replicate-0; inode layout - 3067833780 - 3681400535; disk layout - 2454267024 - 3067833779
[2014-01-29 21:08:19.712054] I [dht-common.c:639:dht_revalidate_cbk] 6-dr-store-dht: mismatching layouts for /bb81265f-4402-4d3d-8227-00dc546fccc8/master
[2014-01-29 21:08:19.712147] I [dht-layout.c:760:dht_layout_dir_mismatch] 6-dr-store-dht: subvol: dr-store-replicate-7; inode layout - 2454267024 - 3067833779; disk layout - 3067833780 - 3681400535
[2014-01-29 21:08:19.712175] I [dht-common.c:639:dht_revalidate_cbk] 6-dr-store-dht: mismatching layouts for /bb81265f-4402-4d3d-8227-00dc546fccc8/master
</snip>
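For triage, the window in which the VM paused can be located by tallying these failed WRITEs directly from the fuse log. The following is a minimal sketch (a hypothetical helper, not part of glusterfs) that assumes only the log-line format visible in the snippet above:

```python
import re

# Matches the fuse_writev_cbk failure lines shown in the snippet, e.g.
# "[2014-01-29 21:08:18.402796] W [fuse-bridge.c:2696:fuse_writev_cbk]
#  0-glusterfs-fuse: 166265: WRITE => -1 (Invalid argument)"
WRITE_ERR = re.compile(
    r"\[(?P<ts>[\d: .-]+)\] W \[fuse-bridge\.c:\d+:fuse_writev_cbk\] "
    r"\S+: (?P<op>\d+): WRITE => -1 \((?P<err>[^)]+)\)"
)

def count_failed_writes(log_lines):
    """Return (total, per-error counts) for failed WRITEs in the given lines."""
    errors = {}
    for line in log_lines:
        m = WRITE_ERR.search(line)
        if m:
            err = m.group("err")
            errors[err] = errors.get(err, 0) + 1
    return sum(errors.values()), errors
```

Run against the full fuse log, a burst of "Invalid argument" write failures like the one above marks the point at which QEMU saw EIO/EINVAL on the image and RHEV paused the VM with a storage domain error.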
This issue looked similar to the bug encountered in the RHS-RHOS environment with respect to glance, https://bugzilla.redhat.com/show_bug.cgi?id=1054782

After discussing with Shyam, he recommended creating a separate bug, as the test case for this issue is different.
The same test case passed in the test run for RHSS 2.1 Update 1.
Ref - https://tcms.engineering.redhat.com/run/90297/?from_plan=7048#caserun_3503134

So adding the "REGRESSION" keyword to this bug.
Notes providing further information:

1. The "remove-brick start" operation was not yet committed; the VMs went to the paused state even before that.

Consequences after the App VM went to the paused state:

1. The App VM could not be resumed from the RHEVM UI and did not get started.
2. Did a forced shutdown, which restarted the App VM.

The App VM came up, and some error messages were noticed in /var/log/dmesg:

<snip>
EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1308175
EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1308168
EXT4-fs (dm-0): 80 orphan inodes deleted
EXT4-fs (dm-0): recovery complete
</snip>

The above statements mean that some filesystem corruption had happened and fsck rectified those errors.
Created attachment 857471 [details]
error message from /var/log/dmesg
This bug has the same root cause as Bug#1059687.

This is a race condition that has existed across a couple of releases; it is not a regression, we have simply hit it only now. We now have a reproducible scenario for this problem, for further analysis and a fix (as mentioned in the other bug).

Hence, request that the regression flag be removed.
Based on https://bugzilla.redhat.com/show_bug.cgi?id=1054782#c10, this should not be treated as a regression/blocker. https://bugzilla.redhat.com/show_bug.cgi?id=1059687 is a forked bug filed for this specific case.
(In reply to Shyamsundar from comment #9)
> This bug has the same root cause as Bug#1059687
[snip]
> Request that the regression flag hence be removed.

Ack, and thanks for looking into this.
Dev ack to 3.0 RHS BZs
Cloning this to 3.1. To be fixed in a future release.