Description of problem:
We have 2 datacenters:
1) NFS datacenter
2) posixfs datacenter

Over the NFS datacenter, we created 4 RHS VMs. Each of these VMs had an additional thin-provisioned virtual disk of 300G, over which 3 LVM volumes of 90G each were created and mounted. A 6x2 distributed-replicate RHS volume was created over these VMs with the LVM volumes as RHS bricks.

In the posixfs datacenter, we added the 6x2 distributed-replicate RHS volume as a storage domain. This storage domain was used as vm-store. Unplugging a disk in any one of the VMs brought down the entire posixfs storage domain.

Version-Release number of selected component (if applicable):
RHEV-H: rhev-hypervisor6-6.4-20130214.0.auto191.el6.iso
RHS: RHS-2.0-20130219.3-RHS-x86_64-DVD1.iso

How reproducible:
Consistently.

Steps to Reproduce:
1. Create VMs over the posixfs storage domain
2. Unplug the disk in one of the RHS brick VMs

Actual results:
Storage domain is down.

Expected results:
Storage domain should be up, since its replica pair is up.

Additional info:
RHS volume info:

# gluster volume info vm-store

Volume Name: vm-store
Type: Distributed-Replicate
Volume ID: 3a6e5c4e-f684-4559-a4ec-e6eeb5ef2b17
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.37.45:/brick1/s1
Brick2: 10.70.37.79:/brick1/s2
Brick3: 10.70.37.45:/brick2/s3
Brick4: 10.70.37.79:/brick2/s4
Brick5: 10.70.37.45:/brick3/s5
Brick6: 10.70.37.79:/brick3/s6
Brick7: 10.70.37.81:/brick1/s7
Brick8: 10.70.37.82:/brick1/s8
Brick9: 10.70.37.81:/brick2/s9
Brick10: 10.70.37.82:/brick2/s10
Brick11: 10.70.37.81:/brick3/s11
Brick12: 10.70.37.82:/brick3/s12
Options Reconfigured:
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
storage.owner-uid: 36
storage.owner-gid: 36

Recovery:
We were able to recover the posixfs storage domain by rebooting the hypervisor.
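For reference, a volume like the one above can be provisioned along these lines. This is a sketch: the brick hosts and paths are taken from the 'gluster volume info' output above, but the exact creation order, and whether the options were set individually or via a group file, are assumptions.

```shell
# Sketch of provisioning the 6x2 distributed-replicate vm-store volume.
# Brick list is from the 'gluster volume info' output; creation order
# is an assumption.
gluster volume create vm-store replica 2 \
    10.70.37.45:/brick1/s1 10.70.37.79:/brick1/s2 \
    10.70.37.45:/brick2/s3 10.70.37.79:/brick2/s4 \
    10.70.37.45:/brick3/s5 10.70.37.79:/brick3/s6 \
    10.70.37.81:/brick1/s7 10.70.37.82:/brick1/s8 \
    10.70.37.81:/brick2/s9 10.70.37.82:/brick2/s10 \
    10.70.37.81:/brick3/s11 10.70.37.82:/brick3/s12

# Options listed under "Options Reconfigured", applied one by one:
gluster volume set vm-store performance.quick-read off
gluster volume set vm-store performance.read-ahead off
gluster volume set vm-store performance.io-cache off
gluster volume set vm-store performance.stat-prefetch off
gluster volume set vm-store cluster.eager-lock enable
gluster volume set vm-store network.remote-dio on
gluster volume set vm-store storage.owner-uid 36
gluster volume set vm-store storage.owner-gid 36

gluster volume start vm-store
```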
There is an additional issue: the RHS VMs hit a kernel panic when the disk is unplugged. This is not the root disk of the VM but the additional disk. Also, all the disks in the VMs have to be plugged back in for the storage domain to come up and for the VMs to be restarted.
Created attachment 702849 [details] Screenshot of the kernel panic
Brian, Can you please look into this?
Hi Anush,

It probably isn't a factor, but just to be sure... is the NFS datacenter a native NFS server with local storage on some remote node, or is it gluster-based NFS (and if so, on what kind of volume)? Also, what type of virtual disks are fed into the RHS VMs for the datacenter volume?

I suspect the panic is a factor here: it converts a lost disk into a lost server. Looking at the vm-store volume, it appears we should survive losing a server, so perhaps there are multiple issues here (i.e., losing the server on a disk pull, and losing the datacenter on a server loss).
I just ran a quick test on a single RHS VM, deactivating a mounted LVM volume (i.e., a brick), and didn't reproduce a panic. Could you also provide the kernel version of the RHS nodes, so I can make sure I'm testing the correct bits? This is on 2.6.32-220.31.1.el6.x86_64.
Never mind, the panic actually does occur when I reattach the drive. This particular behavior looks like a duplicate of bug 867280, which I believe has a fix. We may need the fix in 6.2.z to address this problem (though the use case is not clear to me, as the disk should not disappear from a guest without user interaction).
Ok, I configured a somewhat similar setup:

- RHEV datacenter on native RHS. This is a posixfs datacenter rather than NFS, since I already had it configured. I suspect this is not a factor, as it remains static during the test (but I could be wrong, of course).
- 4 RHS VMs allocated on the native datacenter, with a 300GB secondary virtual disk per VM.
- Created and exposed a 6x2 dht+rep volume across the 4 RHS VMs as a new "virtual" datacenter.
- Allocated a rhel6.3 VM on the virtual datacenter.

... and have run some basic tests based on the virtblk panic:

- Crash a host not serving the active rhel6.3 VM.
- Crash a non-domain host serving one leg of the replica set for the active VM (while running dbench in the VM).
- Recover the non-domain host while running dbench in the VM. (Repeating this a second time did not result in the panic, but hung the RHS host, in turn leading to hung-task messages in the guest until I powered off the RHS VM.)
- Crash the domain host (i.e., the host with the IP of the mount point fed into RHEVM) while running dbench in the VM. Reproduced a latency spike for a period of time (60s).
- Recover the domain host while running dbench in the VM.
- Bring down the rhev host associated with the virt domain, then bring it back up.

At no point during these tests did I lose the storage domain. The domain also returns without issue when I cycle the associated rhev host in and out of maintenance mode.
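The host crashes in the tests above could be induced along these lines. This is a hypothetical method; the comment does not say how the hosts were actually crashed.

```shell
# Hypothetical way to crash an RHS VM for the failure-injection tests
# above; the original comment does not state the method used.
# SysRq 'c' forces an immediate kernel panic on the node.
echo 1 > /proc/sys/kernel/sysrq      # enable all SysRq functions
echo c > /proc/sysrq-trigger         # trigger a crash (kernel panic)
```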
Targeting for 2.1.z (Big Bend) U1.
Considering it's been 9 months since we last tested this, can we re-test it?
Assigning the NEEDINFO on Anush to myself. Let me try this case and furnish the result.
Clearing needinfo flag since this bug is ON_QA for verification.
Per integration bug triage 12/13, action items if this fails QE:
1. Remove "Corbett" from the Internal Whiteboard.
2. Remove "zstream" from Keywords.
3. Remove the 2.1.z flag.
4. Set rhs-future?
Tested with:
  glusterfs-3.4.0.57rhs-1.el6rhs
  ISO: RHSS-2.1-20140116.2
  RHEV: IS 32.2 (3.3.0-0.45.el6ev)

Tested the scenarios suggested by Brian in comment 10.

With RHSS 2.1 Update 2, client-side quorum is enabled. So, with a replica 2 volume, bringing down the first brick makes the volume go read-only, thereby pausing the App VMs. Hence all the cases below were done by bringing down the second node of a replica pair, or by killing the second brick of the replica pair. In all the cases below, 'dd' was running continuously.

Test-1
-------
1. In a 4 node cluster, created a distributed-replicate volume of 2x2
2. Optimized the volume for virt store, i.e.:
     gluster volume set <vol-name> group virt
     gluster volume set <vol-name> storage.owner-uid 36
     gluster volume set <vol-name> storage.owner-gid 36
3. Started the volume
4. Created a data domain with the above created gluster volume
5. Created an App VM with a root disk of size 20GB on the above created data domain
6. Installed the App VM with RHEL 6.5
7. After reboot, started running 'dd' in a loop inside the VM, i.e.:
     dd if=/dev/urandom of=/home/file1 bs=4k count=1000
8. Brought down the second node of the replica pair

Result: VMs are up and running.

Test-2
-------
1. In a 4 node cluster, created a distributed-replicate volume of 2x2
2. Optimized the volume for virt store, i.e.:
     gluster volume set <vol-name> group virt
     gluster volume set <vol-name> storage.owner-uid 36
     gluster volume set <vol-name> storage.owner-gid 36
3. Started the volume
4. Created a data domain with the above created gluster volume
5. Created an App VM with a root disk of size 20GB on the above created data domain
6. Installed the App VM with RHEL 6.5
7. After reboot, started running 'dd' in a loop inside the VM, i.e.:
     dd if=/dev/urandom of=/home/file1 bs=4k count=1000
8. Killed the brick process (glusterfsd) of the second brick in the replica pair

Result: VMs are up and running.

Test-3
-------
1. In a 4 node cluster, created a distributed-replicate volume of 6x2
2.
Optimized the volume for virt store, i.e.:
     gluster volume set <vol-name> group virt
     gluster volume set <vol-name> storage.owner-uid 36
     gluster volume set <vol-name> storage.owner-gid 36
3. Started the volume
4. Created a data domain with the above created gluster volume
5. Created an App VM with a root disk of size 20GB on the above created data domain
6. Installed the App VM with RHEL 6.5
7. After reboot, started running 'dd' in a loop inside the VM, i.e.:
     dd if=/dev/urandom of=/home/file1 bs=4k count=1000
8. Plugged out one of the disks in the RHSS node that is not serving any volume. This can be done from RHEVM by deactivating the disk.

Result: There were no problems and all went well. The App VM was healthy.

Test-4
-------
This is the actual case, where the disk serving the brick was unplugged.
1. In a 4 node cluster, created a distributed-replicate volume of 6x2
2. Optimized the volume for virt store, i.e.:
     gluster volume set <vol-name> group virt
     gluster volume set <vol-name> storage.owner-uid 36
     gluster volume set <vol-name> storage.owner-gid 36
3. Started the volume
4. Created a data domain with the above created gluster volume
5. Created an App VM with a root disk of size 20GB on the above created data domain
6. Installed the App VM with RHEL 6.5
7. After reboot, started running 'dd' in a loop inside the VM, i.e.:
     dd if=/dev/urandom of=/home/file1 bs=4k count=1000
8. Plugged out one of the disks serving a brick for the volume. This can be done from RHEVM by deactivating the disk.

Result: There were immediate error messages on the RHSS node console:

[Sat Jan 18 09:50:16 UTC 2014 root.37.187:~ ] #
Message from syslogd@rhss2 at Jan 18 05:03:57 ...
 glusterfsd: [2014-01-18 10:03:57.898358] M [posix-helpers.c:1313:posix_health_check_thread_proc] 0-drstore-posix: health-check failed, going down

Message from syslogd@rhss2 at Jan 18 05:04:27 ...
 glusterfsd: [2014-01-18 10:04:27.898560] M [posix-helpers.c:1318:posix_health_check_thread_proc] 0-drstore-posix: still alive! -> SIGTERM

Log messages in /var/log/messages on the same node:

<snip>
Jan 18 05:03:55 rhss2 multipathd: vdb: remove path (uevent)
Jan 18 05:03:55 rhss2 kernel: virtio-pci 0000:00:07.0: PCI INT A disabled
Jan 18 05:03:56 rhss2 kernel: XFS (dm-2): metadata I/O error: block 0x6400600 ("xlog_iodone") error 19 buf count 262144
Jan 18 05:03:56 rhss2 kernel: XFS (dm-2): xfs_do_force_shutdown(0x2) called from line 1052 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa02dc011
Jan 18 05:03:56 rhss2 kernel: XFS (dm-2): Log I/O Error Detected. Shutting down filesystem
Jan 18 05:03:56 rhss2 kernel: XFS (dm-2): Please umount the filesystem and rectify the problem(s)
Jan 18 05:03:57 rhss2 glusterfsd: [2014-01-18 10:03:57.898358] M [posix-helpers.c:1313:posix_health_check_thread_proc] 0-drstore-posix: health-check failed, going down
Jan 18 05:04:22 rhss2 kernel: XFS (dm-2): xfs_log_force: error 5 returned.
Jan 18 05:04:27 rhss2 glusterfsd: [2014-01-18 10:04:27.898560] M [posix-helpers.c:1318:posix_health_check_thread_proc] 0-drstore-posix: still alive! -> SIGTERM
Jan 18 05:04:52 rhss2 kernel: XFS (dm-2): xfs_log_force: error 5 returned.
(the "xfs_log_force: error 5 returned." message repeated every 30 seconds through 05:11:52)
</snip>

After all this, the App VM was running healthy and no issues were seen.

Test-5
-------
As a continuation of the previous test:
1. Plugged the disk back into the RHSS node (the disk which was unplugged in the previous test). This can be done by activating the disk from the RHEVM UI.
2. Again all went well and the App VM was healthy, with no other issues. But the disk that was earlier shown as 'vdb' was now shown as 'vdf', which caused a little confusion. Rebooted the RHSS node and all came back well.
3. Triggered self-heal; it went smoothly and completed successfully.

Based on all these results, marking this bug as VERIFIED.
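The brick recovery and self-heal in Test-5 can be sketched as follows. This is a sketch: the volume name (drstore, inferred from the glusterfsd log messages above) and the brick mount path are assumptions, as is the use of 'start ... force' to restart the brick process that went down on the health-check failure.

```shell
# Sketch of recovering a brick after re-attaching its disk.
# Volume name and mount path are assumptions.
mount /dev/mapper/brick_vg-brick_lv /brick1   # remount the brick filesystem
gluster volume start drstore force            # restart the dead brick process
gluster volume heal drstore full              # trigger self-heal (step 3 above)
gluster volume heal drstore info              # monitor pending heal entries
```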
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-0208.html