Description of problem:

VMs running in a RHV environment with a glusterfs SD (not an RHHI deployment) exhibit file system corruption. Files in the VM are truncated and/or filled with ASCII-0 (null) characters.

The problem appears after a lot of I/O writes to a VM disk image that is in qcow format and on a glusterfs Storage Domain. The qcow image can be either the base volume of the disk or an image created via a snapshot.

This is not seen when an NFS storage domain is used. It is also not seen if the volume is in raw format, i.e. the disk has no snapshots and is preallocated.

The file system inside the guest is xfs.

Version-Release number of selected component (if applicable):

Problem seen on both 4.3 and 4.2 versions.

~~~
ovirt-engine-4.3.4.3-0.1.el7.noarch
glusterfs-3.12.2-47.el7.x86_64
qemu-kvm-rhev-2.12.0-18.el7_6.5.x86_64
libvirt-client-4.5.0-23.el7.x86_64
~~~

~~~
ovirt-engine-4.2.8.7-0.1.el7ev.noarch
glusterfs-3.12.2-47.el7.x86_64
qemu-kvm-rhev-2.12.0-18.el7_6.4.x86_64
libvirt-4.5.0-10.el7_6.7.x86_64
~~~

How reproducible:

100%

Steps to Reproduce:

1. On a RHV 4.3 or 4.2 environment, attach a glusterfs Storage Domain.
2. Spin up a VM with a template-based, thin-provisioned disk. The base image of the VM will be qcow.
3. Run 'rpm -Va' and redirect the output to a file.
4. Install lots of packages to simulate writes.
5. Run 'rpm -Va' again and redirect the output to a second file.
6. Compare (diff) the 'rpm -Va' output in both files.

Actual results:

~~~
# diff --side-by-side --suppress-common-lines rpm_va_after_new_vm rpm_va_after_lots_pkg_install
> S.5....T. c /etc/sysconfig/authconfig
> ..5...... /usr/share/i18n/charmaps/ANSI_X3.4-1968.gz
> ....L.... c /etc/pam.d/fingerprint-auth
> ....L.... c /etc/pam.d/password-auth
> ....L.... c /etc/pam.d/postlogin
> ....L.... c /etc/pam.d/smartcard-auth
> ....L.... c /etc/pam.d/system-auth
> ..5...... /usr/share/locale/wa/LC_MESSAGES/atk10.mo
> ..5...... /usr/share/locale/xh/LC_MESSAGES/atk10.mo
> ..5...... /usr/share/locale/zh_CN/LC_MESSAGES/atk10.mo
> ..5...... /usr/share/locale/zu/LC_MESSAGES/atk10.mo
S.5....T. c /etc/sysconfig/authconfig <
....L.... c /etc/pam.d/fingerprint-auth <
....L.... c /etc/pam.d/password-auth <
....L.... c /etc/pam.d/postlogin <
....L.... c /etc/pam.d/smartcard-auth <
....L.... c /etc/pam.d/system-auth <
~~~

The following files are in question:

~~~
> ..5...... /usr/share/locale/wa/LC_MESSAGES/atk10.mo
> ..5...... /usr/share/locale/xh/LC_MESSAGES/atk10.mo
> ..5...... /usr/share/locale/zh_CN/LC_MESSAGES/atk10.mo
> ..5...... /usr/share/locale/zu/LC_MESSAGES/atk10.mo
~~~

- ls -al and mtime info

~~~
# ls -al /usr/share/locale/wa/LC_MESSAGES/atk10.mo /usr/share/locale/xh/LC_MESSAGES/atk10.mo /usr/share/locale/zh_CN/LC_MESSAGES/atk10.mo /usr/share/locale/zu/LC_MESSAGES/atk10.mo
-rw-r--r--. 1 root root 3803 May 31  2018 /usr/share/locale/wa/LC_MESSAGES/atk10.mo
-rw-r--r--. 1 root root 8380 May 31  2018 /usr/share/locale/xh/LC_MESSAGES/atk10.mo
-rw-r--r--. 1 root root 9533 May 31  2018 /usr/share/locale/zh_CN/LC_MESSAGES/atk10.mo
-rw-r--r--. 1 root root 6039 May 31  2018 /usr/share/locale/zu/LC_MESSAGES/atk10.mo
~~~

- Files are filled with 00 bytes

~~~
# hexdump -C /usr/share/locale/wa/LC_MESSAGES/atk10.mo
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000ed0  00 00 00 00 00 00 00 00  00 00 00                 |...........|
00000edb

# hexdump -C /usr/share/locale/xh/LC_MESSAGES/atk10.mo
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000020b0  00 00 00 00 00 00 00 00  00 00 00 00              |............|
000020bc

# hexdump -C /usr/share/locale/zh_CN/LC_MESSAGES/atk10.mo
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002530  00 00 00 00 00 00 00 00  00 00 00 00 00           |.............|
0000253d

# hexdump -C /usr/share/locale/zu/LC_MESSAGES/atk10.mo
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001790  00 00 00 00 00 00 00                              |.......|
00001797
~~~

Expected results:

~~~
Files shouldn't be truncated and/or filled with ASCII-0 (null) characters.
~~~

Additional info:
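
The per-file hexdump check above could be automated when many files need to be examined. The following is only a minimal sketch, not something used in the original investigation: the function names `is_all_nul` and `report_nul_files` are hypothetical, and it assumes that a corrupted file keeps its original size but contains nothing except 00 bytes, as in the hexdump output.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report): succeeds only if the
# file is non-empty and consists solely of NUL (0x00) bytes.
is_all_nul() {
    local file=$1
    # An empty file is not treated as NUL-filled.
    [ -s "$file" ] || return 1
    # tr deletes every NUL byte; if nothing is left, the file was all NULs.
    [ "$(tr -d '\000' < "$file" | wc -c)" -eq 0 ]
}

# Print each regular file among the arguments that is entirely NUL-filled.
report_nul_files() {
    local f
    for f in "$@"; do
        if [ -f "$f" ] && is_all_nul "$f"; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}
```

For example, `report_nul_files /usr/share/locale/*/LC_MESSAGES/atk10.mo` would be expected to list the four files in question on an affected guest. Files that are only partially zeroed or truncated would not be caught by this check and still need the 'rpm -Va' comparison.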
Closed CURRENTRELEASE based on oVirt bz#1701736, in 4.3.5.