Bug 1534403 - Severe filesystem corruption with virtio and sharding. 100% reproducible
Summary: Severe filesystem corruption with virtio and sharding. 100% reproducible
Alias: None
Product: GlusterFS
Classification: Community
Component: libgfapi
Version: mainline
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Niels de Vos
QA Contact: bugs@gluster.org
Depends On:
Reported: 2018-01-15 07:21 UTC by Luca Lazzeroni
Modified: 2020-03-12 12:46 UTC
CC: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2020-03-12 12:46:23 UTC
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:


Description Luca Lazzeroni 2018-01-15 07:21:11 UTC
I'm experiencing severe, always-reproducible filesystem corruption with virtio drivers used inside a guest.
My setup:

3 nodes
1 volume in replica 3 arbiter 1 mode
"virt" group applied to volume
Gluster 3.12.4 from Centos repositories
KVM 2.9.0 compiled from RHEV sources
Guest machine using virtio drivers (Ubuntu 16.04.3 LTS server)
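
For reference, a volume matching this setup could be created roughly as follows. This is only a sketch: the volume name, node names, and brick paths are placeholders, not taken from the report.

```shell
# Replica 3 volume with one arbiter brick per replica set
# (node names and brick paths are hypothetical)
gluster volume create vmstore replica 3 arbiter 1 \
    node1:/bricks/vmstore node2:/bricks/vmstore node3:/bricks/vmstore

# Apply the "virt" profile; among other tunables it enables
# features.shard, which is relevant to this bug
gluster volume set vmstore group virt

gluster volume start vmstore
```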

To reproduce the problem, it's enough to:

1) Create a virtual machine with a virtual disk attached via virtio (preferably in QCOW2 format, though the problem also occurs with the "raw" disk format)
2) Start the Ubuntu 16.04.3 installer
3) After a while the installer fails with a strange error; inspecting the QCOW2 volume, I found that many files are full of null bytes
4) The problem seems to appear after a burst of write operations.
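
The steps above can be sketched with qemu's native libgfapi support. This is a minimal sketch; the volume name, server, image path, and sizes are assumptions, not from the report.

```shell
# Create the guest disk directly on the Gluster volume via libgfapi
# ("node1" and "vmstore" are placeholder names)
qemu-img create -f qcow2 gluster://node1/vmstore/ubuntu.qcow2 20G

# Boot the Ubuntu installer with the disk attached as a virtio device;
# qemu accesses the image through libgfapi, not through a local mount
qemu-system-x86_64 -enable-kvm -m 2048 \
    -drive file=gluster://node1/vmstore/ubuntu.qcow2,format=qcow2,if=virtio \
    -cdrom ubuntu-16.04.3-server-amd64.iso -boot d
```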

I've tested this many times and the result is always the same.

If, instead, I select a SCSI disk, everything works flawlessly.

Thank you for your help,

Comment 1 Luca Lazzeroni 2018-01-15 07:22:33 UTC
I should correct myself: I can see the problem with the SCSI driver and a QCOW2 disk too.

Comment 2 Luca Lazzeroni 2018-01-15 08:09:29 UTC
The problem seems related to libgfapi: if I access the volume via NFS (instead of directly via libgfapi), everything works. Tested twice.

Comment 3 Luca Lazzeroni 2018-01-15 09:32:19 UTC
I've also tested with a FUSE-mounted filesystem (i.e., the host mounts the Gluster volume via FUSE and KVM uses the QCOW2 disk on the mounted volume), and that works too.
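
For comparison, the working FUSE-based setup described here would look roughly like this (mount point and names are assumptions):

```shell
# Mount the Gluster volume on the host via the FUSE client
mount -t glusterfs node1:/vmstore /mnt/vmstore

# Point KVM at the image on the mounted path; I/O now goes through
# the FUSE client instead of libgfapi
qemu-system-x86_64 -enable-kvm -m 2048 \
    -drive file=/mnt/vmstore/ubuntu.qcow2,format=qcow2,if=virtio
```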

Comment 4 Shyamsundar 2018-10-23 14:53:50 UTC
Release 3.12 has reached EOL and this bug was still in the NEW state, so the version is being moved to mainline for triage and appropriate action.

Comment 5 Worker Ant 2020-03-12 12:46:23 UTC
This bug is moved to https://github.com/gluster/glusterfs/issues/944, and will be tracked there from now on. Visit the GitHub issue URL for further details.
