Description of problem:
-----------------------
Configured a sharded ( shard-block-size=512MB ) replica 3 gluster volume to store VM images. The VM errors out while booting from the image; sometimes even the OS installation errors out.

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHEL 7.2
RHGS 3.2.0 interim build ( glusterfs-3.8.4-1.el7rhgs )
qemu-kvm-1.5.3-105.el7_2.7.x86_64
qemu-img-1.5.3-105.el7_2.7.x86_64
libvirt-1.2.17-13.el7_2.5.x86_64

How reproducible:
-----------------
Always ( 7/7 )

Steps to Reproduce:
-------------------
1. Create a gluster replica 3 volume.
2. Enable sharding with the shard block size set to 512MB.
3. Enable compound-fops on the volume.
4. Create a raw image file on the volume using qemu-img + libgfapi.
5. Install & boot a VM using libgfapi.

Actual results:
---------------
The VM errors out while booting from the installed OS. Sometimes the installation itself errors out.

Expected results:
-----------------
VMs should boot successfully.

Additional Info
----------------
1. With compound fops turned off, I do not see this problem.
2. The issue is also seen with a fuse mount.
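For anyone trying to reproduce this, the steps above can be sketched as the command sequence below. This is only a rough sketch: the host names (host1..host3) and the brick directory are hypothetical placeholders, and the option name cluster.use-compound-fops is my assumption for "enable compound-fops", so please confirm it with `gluster volume set help` on your build. The script only builds and prints the commands; it does not run them.

```python
import shlex


def repro_commands(volume="repvol", hosts=("host1", "host2", "host3"),
                   brick_dir="/bricks/brick1", image="vm2.img", size="30G"):
    """Return the argv lists for the reproduction steps.

    hosts and brick_dir are hypothetical placeholders, not taken from the report.
    """
    bricks = [f"{h}:{brick_dir}/{volume}" for h in hosts]
    return [
        # Step 1: replica 3 volume, one brick per host.
        ["gluster", "volume", "create", volume, "replica", "3", *bricks],
        # Step 2: sharding with a 512MB shard block size.
        ["gluster", "volume", "set", volume, "features.shard", "on"],
        ["gluster", "volume", "set", volume, "features.shard-block-size", "512MB"],
        # Step 3: compound fops (option name is an assumption, verify on your build).
        ["gluster", "volume", "set", volume, "cluster.use-compound-fops", "on"],
        ["gluster", "volume", "start", volume],
        # Step 4: raw image created through libgfapi using a gluster:// URL.
        ["qemu-img", "create", "-f", "raw",
         f"gluster://{hosts[0]}/{volume}/{image}", size],
    ]


if __name__ == "__main__":
    for argv in repro_commands():
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Step 5 (installing and booting the VM through libgfapi) is left out because it depends on the local libvirt setup.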
Errors as seen in one of the brick logs:

<snip>
[2016-09-28 09:28:29.748305] E [inodelk.c:304:__inode_unlock_lock] 0-repvol-locks: Matching lock not found for unlock 0-9223372036854775807, by b47312255f7f0000 on 0x7fa408104500
[2016-09-28 09:28:29.748390] E [MSGID: 115053] [server-rpc-fops.c:313:server_finodelk_cbk] 0-repvol-server: 76262: FINODELK -2 (d8a6d740-b41b-417d-bb5c-4fa90cb43260) ==> (Invalid argument) [Invalid argument]
[2016-09-28 09:28:36.034630] E [inodelk.c:304:__inode_unlock_lock] 0-repvol-locks: Matching lock not found for unlock 0-9223372036854775807, by 107212255f7f0000 on 0x7fa408104500
[2016-09-28 09:28:36.034702] E [MSGID: 136002] [decompounder.c:370:dc_finodelk_cbk] 0-repvol-decompounder: fop number 2 failed. Unwinding. [Invalid argument]
[2016-09-28 09:28:36.034740] E [MSGID: 115090] [server-rpc-fops.c:2087:server_compound_cbk] 0-repvol-server: 78096: COMPOUND-2 (d8a6d740-b41b-417d-bb5c-4fa90cb43260) ==> (Invalid argument) [Invalid argument]
[2016-09-28 09:29:14.755726] E [inodelk.c:304:__inode_unlock_lock] 0-repvol-locks: Matching lock not found for unlock 0-9223372036854775807, by 287312255f7f0000 on 0x7fa408104500
[2016-09-28 09:29:14.755933] E [MSGID: 115090] [server-rpc-fops.c:2087:server_compound_cbk] 0-repvol-server: 80092: COMPOUND-2 (d8a6d740-b41b-417d-bb5c-4fa90cb43260) ==> (Invalid argument) [Invalid argument]
[2016-09-28 09:29:44.759190] E [inodelk.c:304:__inode_unlock_lock] 0-repvol-locks: Matching lock not found for unlock 0-9223372036854775807, by 287312255f7f0000 on 0x7fa408104500
[2016-09-28 09:29:44.759254] E [MSGID: 115090] [server-rpc-fops.c:2087:server_compound_cbk] 0-repvol-server: 80327: COMPOUND-2 (d8a6d740-b41b-417d-bb5c-4fa90cb43260) ==> (Invalid argument) [Invalid argument]
The message "E [MSGID: 136002] [decompounder.c:370:dc_finodelk_cbk] 0-repvol-decompounder: fop number 2 failed. Unwinding. [Invalid argument]" repeated 2 times between [2016-09-28 09:28:36.034702] and [2016-09-28 09:29:44.759245]
[2016-09-28 09:30:39.254643] E [inodelk.c:304:__inode_unlock_lock] 0-repvol-locks: Matching lock not found for unlock 0-9223372036854775807, by 287312255f7f0000 on 0x7fa408104500
[2016-09-28 09:30:39.254714] E [MSGID: 136002] [decompounder.c:370:dc_finodelk_cbk] 0-repvol-decompounder: fop number 2 failed. Unwinding. [Invalid argument]
[2016-09-28 09:30:39.254824] E [MSGID: 115090] [server-rpc-fops.c:2087:server_compound_cbk] 0-repvol-server: 80392: COMPOUND-2 (d8a6d740-b41b-417d-bb5c-4fa90cb43260) ==> (Invalid argument) [Invalid argument]
</snip>
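To see which lock owners hit the failed unlock across the full brick logs, something like the sketch below may help. It is not part of any gluster tooling; the regular expression is written against the message format shown in the excerpt above and is assumed to hold for the rest of the log, and the function name is my own. The sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Matches the "Matching lock not found for unlock" errors from inodelk.c,
# capturing the byte range, the lock owner and the client pointer.
UNLOCK_RE = re.compile(
    r"\[(?P<ts>[\d-]+ [\d:.]+)\] E \[inodelk\.c:\d+:__inode_unlock_lock\] "
    r"\S+: Matching lock not found for unlock (?P<start>\d+)-(?P<end>\d+), "
    r"by (?P<owner>[0-9a-f]+) on (?P<client>0x[0-9a-f]+)"
)


def failed_unlocks_by_owner(lines):
    """Count failed inodelk unlock attempts per lock owner."""
    counts = Counter()
    for line in lines:
        match = UNLOCK_RE.search(line)
        if match:
            counts[match.group("owner")] += 1
    return counts


SAMPLE = [
    "[2016-09-28 09:28:29.748305] E [inodelk.c:304:__inode_unlock_lock] 0-repvol-locks: Matching lock not found for unlock 0-9223372036854775807, by b47312255f7f0000 on 0x7fa408104500",
    "[2016-09-28 09:29:14.755726] E [inodelk.c:304:__inode_unlock_lock] 0-repvol-locks: Matching lock not found for unlock 0-9223372036854775807, by 287312255f7f0000 on 0x7fa408104500",
    "[2016-09-28 09:29:44.759190] E [inodelk.c:304:__inode_unlock_lock] 0-repvol-locks: Matching lock not found for unlock 0-9223372036854775807, by 287312255f7f0000 on 0x7fa408104500",
]

if __name__ == "__main__":
    for owner, count in failed_unlocks_by_owner(SAMPLE).most_common():
        print(owner, count)
```

To run it on a real brick log, pass an open file object instead of SAMPLE. Note that the upper bound 9223372036854775807 in these messages is the maximum signed 64-bit value.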
Created attachment 1205415 [details] brick_log tar-ed
Created attachment 1205416 [details] fuse mount logs tar-ed
Created attachment 1205417 [details] qemu_logs of the VM
I missed some information that I think would be useful for debugging:

1. The VM name is vm2
2. The image file that got corrupted is vm2.img
3. The size of the image file is 30G
4. The volume (repvol) has 2 images ( vm2.img and cvm2.img )
There's a file corruption issue with compound fops that I'm currently debugging - https://bugzilla.redhat.com/show_bug.cgi?id=1378778#c1 This *could* be related to the same issue. Will post what I find once I have the root cause (RC).
Moving this bug to POST state as the upstream patch which was sent against BZ 1378778 is available for review at http://review.gluster.org/#/c/15654/
upstream mainline : http://review.gluster.org/#/c/15654/
upstream 3.9 : http://review.gluster.org/#/c/15709/
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/87999/
Tested with RHGS 3.2.0 interim build ( glusterfs-3.8.4-3.el7rhgs )

0. Created a replica 3 volume and optimized it for VM store
1. Created a VM image file on the fuse mount
2. Installed RHEL 7.2 on the application VM

The VM booted successfully and there was no corruption of VM images.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html